
Duplicate Removal From Collections.

AmberKamboj
Level 2
Hi, I have been trying to remove duplicates from an Excel file and then insert the remaining rows into a work queue. The Excel file has 4 fields, one of which is unique (say, Product Code), and 100+ entries. I don't want non-unique values to be inserted into the work queue. Currently I am using the Business Object "Utility Collection Manipulation" and the Action Stage "Collection Contains Values" to check every single item in the Excel file, and then removing the rows with a non-unique value. When I run this in Control Room, it works fine but takes more than 3 minutes (depending on my logic and the number of entries in the Excel file). I want a better solution that can do this in less than 10 seconds. I have heard that things like DataView, DataTable, or a Code Stage could be used to do this quickly and internally. Since I am very new to this technology, I don't know what to do. Please help! Thanks.
3 REPLIES

david.l.morris
Level 14
Change Stage logging to 'Errors only' for all the stages in the loop. It'll run in less than 10 seconds (or so) in Control Room after that.
Dave Morris 3Ci at Southern Company Atlanta, GA

AmiBarrett
Level 12

Here's a simple C# code stage that'll get you the distinct values.

Inputs:
Collection - Collection
Column - Text (Name of column)

Outputs:
Sorted Collection - Collection

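// Distinct values of the named column. Note: ToTable(true, Column) returns a
// table containing only that column, and the result is not actually sorted.
DataView dv = Collection.DefaultView;
Sorted_Collection = dv.ToTable(true, Column);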
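
If the other fields need to survive (the original question has 4 fields per row, with Product Code as the key), a variation along these lines might work. This is only a sketch, not a tested Blue Prism object: the class name DedupeSketch, the method name DistinctByColumn, and the sample data are made up for illustration. It assumes the first row seen for each key is the one to keep, and that the System.Collections.Generic namespace is available to the code stage.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class DedupeSketch
{
    // Returns a new table with the same columns as 'source', keeping only
    // the first row seen for each distinct value in 'keyColumn'.
    public static DataTable DistinctByColumn(DataTable source, string keyColumn)
    {
        DataTable result = source.Clone();   // copies the schema, no rows
        var seen = new HashSet<string>();    // case-sensitive by default

        foreach (DataRow row in source.Rows)
        {
            string key = Convert.ToString(row[keyColumn]);
            if (seen.Add(key))               // Add returns false for a repeat
            {
                result.ImportRow(row);
            }
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample data standing in for the Excel collection.
        var table = new DataTable();
        table.Columns.Add("Product Code", typeof(string));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add("A1", "Widget");
        table.Rows.Add("B2", "Gadget");
        table.Rows.Add("A1", "Widget (duplicate)");

        DataTable distinct = DistinctByColumn(table, "Product Code");
        Console.WriteLine(distinct.Rows.Count);   // prints 2
    }
}
```

Inside a code stage with the same inputs and output as above, the body would presumably reduce to the loop itself, reading from Collection and Column and assigning the result to Sorted_Collection. Because the whole pass runs in one stage, there is no per-row stage logging to slow it down.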

AmberKamboj
Level 2
Thanks @david and @amitbarrett, you guys really helped me out.