17-03-23 05:54 PM
Hi there,
How can I remove duplicate values from a collection? Say we have two columns (ID and NAME): if an ID has a duplicate value, the bot should remove the complete row.
There have been some earlier discussions about this on the community, but I didn't find a working solution.
Kindly advise.
17-03-23 06:56 PM
Hi Madhu Garg,
It seems you want to eliminate an entire row based on the value in one column. For example, in a collection with ID and NAME columns, you would remove the whole row whenever its ID duplicates an ID already seen, regardless of what data the other columns hold.
This can be done easily with a Code Stage. First, sort the collection by the column you want to check, using the sort action of the Collection Manipulation object. If you don't have the VBO, you can find it here: https://digitalexchange.blueprism.com/dx/entry/3439/solution/utility---collection-manipulation
Then call a custom action that removes every row whose value in that column repeats the row before it. To create the custom action, add it to one of your extended collection manipulation objects, and make sure the Initialise stage lists the correct external references (DLLs) and their namespaces. Follow the approach below:
Code for the custom action: add an action to an existing extended collection manipulation object, or create a new object if you don't have one. Add a Code Stage.
Set the inputs to the collection (rawData) and the column name (columnToCheck1), and set the output to a collection (filteredCollection).
Write this code in the Code Stage:
' Assumes rawData is already sorted by the column being checked,
' so duplicate values sit on adjacent rows.
Dim rowIndex As Integer
' Walk backwards so removing a row never shifts the rows still to be compared.
For rowIndex = rawData.Rows.Count - 2 To 0 Step -1
    If rawData.Rows(rowIndex).Item(columnToCheck1) = rawData.Rows(rowIndex + 1).Item(columnToCheck1) Then
        ' Same value as the row before it: remove the later row and keep the first.
        rawData.Rows.RemoveAt(rowIndex + 1)
    End If
Next
filteredCollection = rawData.Copy()
Publish the object, go back to your process, sort the collection that needs de-duplicating, and then call the action above.
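If you want to try the logic outside Blue Prism first, here is a minimal console sketch of the same sort-then-compare-neighbours idea. It is only an illustration under a few assumptions: the module name, the RemoveDuplicateRows helper and the sample data are made up for this example, the sort is done with a DataView rather than the VBO action, and the column name is assumed to be a simple one with no special characters.

```vbnet
Imports System
Imports System.Data

Module DedupSketch
    ' Hypothetical helper (not part of any Blue Prism VBO): sorts by one column,
    ' then removes every row whose value in that column matches the row before it.
    Function RemoveDuplicateRows(rawData As DataTable, columnToCheck As String) As DataTable
        ' Sort into a new table so the caller's table is left untouched.
        ' Assumption: columnToCheck is a plain column name without "]" in it.
        Dim view As New DataView(rawData)
        view.Sort = "[" & columnToCheck & "]"
        Dim sorted As DataTable = view.ToTable()

        ' Walk backwards so removals never shift rows that are still to be compared.
        For rowIndex As Integer = sorted.Rows.Count - 2 To 0 Step -1
            If Object.Equals(sorted.Rows(rowIndex)(columnToCheck), sorted.Rows(rowIndex + 1)(columnToCheck)) Then
                sorted.Rows.RemoveAt(rowIndex + 1)
            End If
        Next
        Return sorted
    End Function

    Sub Main()
        ' Sample data invented for the example.
        Dim t As New DataTable()
        t.Columns.Add("ID", GetType(String))
        t.Columns.Add("NAME", GetType(String))
        t.Rows.Add("2", "Bob")
        t.Rows.Add("1", "Alice")
        t.Rows.Add("2", "Bobby")
        t.Rows.Add("3", "Carol")

        Dim result As DataTable = RemoveDuplicateRows(t, "ID")
        For Each row As DataRow In result.Rows
            Console.WriteLine(row("ID").ToString() & " " & row("NAME").ToString())
        Next
    End Sub
End Module
```

The result holds one row per ID. Which of the two ID 2 rows survives depends on how the sort orders rows with equal IDs, so add a second sort column if you need a specific one to be kept.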
The approach above covers cases where a single column decides whether a row is a duplicate. If you need to check two, three or more columns, I have covered that in another thread. The code is much the same and simply adds a few more conditions to the comparison, so it is pretty straightforward. See https://community.blueprism.com/discussion/remove-duplicate-rows-based-on-3-column-names?ReturnUrl=%2fcontent%2fallrecentposts
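As a rough illustration of what "a few more conditions" can look like, here is a minimal sketch that compares several columns at once. It is not the code from the linked thread: the SameKey and RemoveDuplicateRows helpers, the CITY column and the sample data are assumptions made for this example, and the table is assumed to be sorted by the same columns already.

```vbnet
Imports System
Imports System.Data

Module MultiColumnDedupSketch
    ' Hypothetical helper: True when both rows hold equal values in every listed column.
    Function SameKey(a As DataRow, b As DataRow, columns As String()) As Boolean
        For Each col As String In columns
            If Not Object.Equals(a(col), b(col)) Then Return False
        Next
        Return True
    End Function

    ' Removes rows that repeat the previous row's values in all listed columns.
    ' Assumes rawData is already sorted by those same columns.
    Sub RemoveDuplicateRows(rawData As DataTable, columns As String())
        ' Walk backwards so removals never shift rows that are still to be compared.
        For rowIndex As Integer = rawData.Rows.Count - 2 To 0 Step -1
            If SameKey(rawData.Rows(rowIndex), rawData.Rows(rowIndex + 1), columns) Then
                rawData.Rows.RemoveAt(rowIndex + 1)
            End If
        Next
    End Sub

    Sub Main()
        ' Sample data invented for the example, already sorted by ID, NAME, CITY.
        Dim t As New DataTable()
        t.Columns.Add("ID", GetType(String))
        t.Columns.Add("NAME", GetType(String))
        t.Columns.Add("CITY", GetType(String))
        t.Rows.Add("1", "Alice", "Leeds")
        t.Rows.Add("1", "Alice", "Leeds")
        t.Rows.Add("1", "Alice", "York")
        t.Rows.Add("2", "Bob", "York")

        RemoveDuplicateRows(t, New String() {"ID", "NAME", "CITY"})
        Console.WriteLine(t.Rows.Count) ' prints 3: only the exact repeat is removed
    End Sub
End Module
```

Passing the column list as an array keeps the loop unchanged however many columns you check. Only the list passed in needs to change.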
------------------------------
Kindly up vote this as "Best Answer" if it adds value or resolves your query in anyway possible, happy to help.
Regards,
Mukesh Kumar - Senior Automation Developer
NHS England, United Kingdom, GB
------------------------------
05-01-24 03:13 PM
Great suggestion, worked perfectly. I personally had to tweak the code because I wanted to change the name of the Inputs and Outputs - very useful VBO addition.
29-04-24 04:17 PM
Hi BP team, can we please get this action added to the native Utility - Collection VBO? Deleting duplicate rows is a very common scenario, and it should be available in the out-of-the-box utility.