Remove Duplicate In Collection

MadhuGarg
Level 4

Hi there,

How can I remove duplicate values from a collection? Let's say we have two columns (ID and NAME): if an ID has a duplicate value, the bot should remove the complete row.

There have been some earlier discussions about this in the community, but I didn't find a working solution.

Kindly advise.



------------------------------
Madhu Garg
------------------------------
2 REPLIES

Mukeshh_k
MVP

Hi Madhu Garg,

It seems you want to remove an entire row based on the value in a single column. In the example collection below, a row should be removed whenever its ID is a duplicate, no matter what data the other columns of that row contain.

23113.png

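This can be done easily with a Code Stage, as described below. First, sort the data (Collection Manipulation object action). The sort matters because the code compares each row only with the row directly after it, so duplicate IDs must sit next to each other. You can find the VBO here if you don't have it: https://digitalexchange.blueprism.com/dx/entry/3439/solution/utility---collection-manipulation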

23114.png
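
If the Collection Manipulation VBO is not available, the sort could also be done in code. The following is a minimal sketch and not part of the VBO: `SortByColumn` and its parameter names are hypothetical, and it assumes the collection is a `System.Data.DataTable`, which is how a collection appears inside a code stage.

```vbnet
Imports System.Data

Module SortSketch
    ' Hypothetical helper: returns a copy of rawData ordered by one column.
    Function SortByColumn(rawData As DataTable, columnToCheck As String) As DataTable
        Dim view As New DataView(rawData)
        ' Square brackets allow column names that contain spaces.
        view.Sort = "[" & columnToCheck & "] ASC"
        ' ToTable materialises the sorted view as a new DataTable.
        Return view.ToTable()
    End Function
End Module
```

Inside a code stage only the three lines of the function body would be needed, with the result assigned to the output collection instead of returned.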

Then call custom code that removes the entire row based on the column you want to check. To create the custom code, add an action to one of your extended collection manipulation objects, and make sure the required DLL external references and their namespaces are listed in the Initialise stage. Follow the approach below:

Code for the custom action: add an action to an existing extended collection manipulation object, or create a new object if you don't have one, and add a Code Stage.

23115.png

Set the inputs to the collection and the column name (rawData and columnToCheck1 in the code below), and set the output collection (filteredCollection).

23116.png

23117.png

Write the Code in Code Stage:

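' The collection must already be sorted on the checked column,
' because only neighbouring rows are compared.
' rawData and columnToCheck1 are the code stage inputs; filteredCollection is the output.
Dim rowIndex As Integer
' Walk backwards so that deleting a row never shifts the rows still to be visited.
For rowIndex = rawData.Rows.Count - 2 To 0 Step -1
    If rawData.Rows(rowIndex).Item(columnToCheck1) = rawData.Rows(rowIndex + 1).Item(columnToCheck1) Then
        ' Same value as the row before it: drop the later row.
        rawData.Rows(rowIndex + 1).Delete()
    End If
Next
' Commit the deletions so the copy contains only the surviving rows.
rawData.AcceptChanges()
filteredCollection = rawData.Copy()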

23118.png

Publish the object, go back to the process, sort the collection that needs duplicates removed, and then call the action above.

23119.png

Results:

23120.png
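
The action above relies on the collection being sorted first. Where the original row order has to be preserved, one alternative is to remember which values have already been seen. This is a minimal sketch under assumptions, not the code from this thread: `KeepFirstPerValue` and its parameter names are hypothetical, and values are compared by their text form.

```vbnet
Imports System.Collections.Generic
Imports System.Data

Module DedupSketch
    ' Hypothetical helper: keeps the first row seen for each value of columnToCheck.
    Function KeepFirstPerValue(rawData As DataTable, columnToCheck As String) As DataTable
        Dim seen As New Dictionary(Of String, Boolean)
        Dim result As DataTable = rawData.Clone()   ' same columns, no rows
        For Each row As DataRow In rawData.Rows
            ' Values are compared by their text form; empty cells become "".
            Dim key As String = Convert.ToString(row(columnToCheck))
            If Not seen.ContainsKey(key) Then
                seen.Add(key, True)
                result.ImportRow(row)
            End If
        Next
        Return result
    End Function
End Module
```

A `Dictionary` is used rather than a `HashSet` because it lives in the core library, so no extra DLL reference should be needed in the object. The comparison is case-sensitive, so "abc" and "ABC" count as different values.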

The approach above covers cases where only a single column is checked for duplicates. For checks across multiple columns, there are other threads where I have described duplicate removal depending on the number of columns. If you need to validate two, three, or more columns, you can refer to them and see how the code changes: it just adds a few more conditions to the check, which is pretty straightforward. See https://community.blueprism.com/discussion/remove-duplicate-rows-based-on-3-column-names?ReturnUrl=%2fcontent%2fallrecentposts
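
As a rough illustration of what "a few more conditions" could look like for two columns, here is a sketch. It is not the code from the linked thread: `RemoveAdjacentDuplicates`, `columnA`, and `columnB` are hypothetical names, and the collection is assumed to be sorted on both columns beforehand.

```vbnet
Imports System.Data

Module MultiColumnSketch
    ' Hypothetical helper: removes a row when it matches the row before it
    ' on both columns. rawData is assumed to be sorted on those columns already.
    Function RemoveAdjacentDuplicates(rawData As DataTable, columnA As String, columnB As String) As DataTable
        ' Walk backwards so removals never shift rows that are still to be visited.
        For rowIndex As Integer = rawData.Rows.Count - 2 To 0 Step -1
            Dim upper As DataRow = rawData.Rows(rowIndex)
            Dim lower As DataRow = rawData.Rows(rowIndex + 1)
            ' A row counts as a duplicate only when every checked column matches.
            If Object.Equals(upper(columnA), lower(columnA)) AndAlso _
               Object.Equals(upper(columnB), lower(columnB)) Then
                rawData.Rows.RemoveAt(rowIndex + 1)
            End If
        Next
        Return rawData.Copy()
    End Function
End Module
```

Each additional column would add one more `AndAlso` condition of the same shape.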



------------------------------
Kindly upvote this as "Best Answer" if it adds value or resolves your query in any way. Happy to help.

Regards,

Mukesh Kumar - Senior Automation Developer

NHS England, United Kingdom, GB
------------------------------

Great suggestion, worked perfectly. I personally had to tweak the code because I wanted to change the names of the inputs and outputs. A very useful VBO addition.



------------------------------
Louis Gilmour
------------------------------