cancel
Showing results for 
Search instead for 
Did you mean: 

Remove Duplicates from excel

NupurSood
Level 5
Hello All

Am looking a a code that is able to remove all duplicates from excel directly. I do not want to take the data in a collection and then remove duplicates because of huge volume. The de-duping must happen on the basis of all columns of excel and not just one defined column.

Thanks in advance for your guidance

Regards
Nupur

------------------------------
Nupur Sood
Research Associate
S&P
Asia/Kolkata
------------------------------
31 REPLIES 31

ewilson
Staff
Staff
Are you writing your own code, trying to automate Excel via the UI, or working with one of the Excel VBOs on the DX? There is a deduplication method on the Range object in VBA. You can find more about it here.

Unfortunately, this feature isn't included in the Excel VBOs provided by BP at the moment, but with the Microsoft documentation referenced above you should be able to add it to your local VBO.

Cheers,

------------------------------
Eric Wilson
Director, Partner Integrations for Digital Exchange
Blue Prism
------------------------------

Am trying to utilize one of the excel VBOs itself and add a page to it that removes dupes by taking into consideration all the columns of excel. 
My challenge is converting "Columns:=Array(1,2)" to the format that VB would accept. I keep running into type mismatch error.

Thanks 
Nupur


------------------------------
Nupur Sood
Research Associate
S&P
Asia/Kolkata
------------------------------

Can you share a screenshot of your code or copy-and-paste it in the thread?

Cheers,

------------------------------
Eric Wilson
Director, Partner Integrations for Digital Exchange
Blue Prism
------------------------------

Hello, Nupur,

how are you :)? 

If you need to consider all columns and you are familiar with C# (may be there is an alternative in VBA) then this one line code does what you need 😉

Output_Collection = Input_Collection.DefaultView.ToTable( /*distinct*/ true);

The input parameter to the action is the collection - you get it from Excel via Get Worksheet (I know you know it) - and output collection is only unique rows.

Regards,

------------------------------
Zdeněk Kabátek
Head of Professional Services
NEOOPS
http://www.neoops.com/
Europe/Prague
------------------------------

Hello Zdenek 

Very well, hoping the same for you 🙂 And thank you for coming to rescue.
 Am afraid that the collection approach does not work for me since the amount of data in my excel is huge (~50,000 rows) and getting data into collection freezes up my  BP. Am trying to achieve the same directly in excel instead. 

Regards

------------------------------
Nupur Sood
Research Associate
S&P
Asia/Kolkata
------------------------------

@NupurSood another way you can achieve this since you're digital worker will need a running instance of Excel is to simply attach to that instance and perform the deduplication via the Excel UI which is detailed in the following article:

https://support.microsoft.com/en-us/office/find-and-remove-duplicates-00e35bea-b46a-4d5d-b28e-66a552dc138d

Cheers,


------------------------------
Eric Wilson
Director, Partner Integrations for Digital Exchange
Blue Prism
------------------------------

Thank You for your response. Below is the code am using for reference and the challenge is to pass array of column numbers instead of just one column in the highlighted portion

26199.png


------------------------------
Nupur Sood
Research Associate
S&P
Asia/Kolkata
------------------------------

Thank You 

I can adopt this approach but I was hoping to do something directly via code stage to be able to have more optimal/robust solution. If I am unable to crack the code, this would be helpful. 
Thanks again for all your guidance. 

Regards

------------------------------
Nupur Sood
Research Associate
S&P
Asia/Kolkata
------------------------------

Hey Nupur,

You can achieve this using Code stage and have to write custom code along adding related dll and namespace.

Thanks
Nilesh

------------------------------
Nilesh Jadhav
Senior RPA Specialist
ADP
India
------------------------------
Nilesh Jadhav.
Consultant
ADP,India