Filter Unique Values Collection

SebT
Level 4
Hi,

I have a collection containing 500 rows. I want to identify all duplicates based on the values in one of its columns and move those duplicates to a separate collection.

I believe this might be achievable using the 'Filter Collection' action, or perhaps a code stage, but I don't know what syntax to use.

Any help would be greatly appreciated.

Thanks
Br,
Sebastian

vinodchinthakin
Level 9
Hi Sebastian,

You can use this VBO from the DX.
Use the 'Distinct' action to get the unique values.
Then use the 'Not In Operator' action to get the duplicate values, passing the unique-value collection as the input collection.

Hi Seb T,

The fastest way to filter out unique values is to extend the Collection Manipulation VBO with a new action named "Keep Unique Values". Give the action one collection as an input parameter and another collection as an output parameter, and use the following code in the code stage:

Output_Collection = Input_Collection.DefaultView.ToTable(True)

The screenshots below show the solution for reference:

36184.png
36185.png
36186.png
36187.png
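To make the behaviour of a whole-row distinct explicit, here is a minimal sketch of the same idea in Python (not Blue Prism code). The function name `distinct_rows` and the sample rows are hypothetical, chosen only for illustration. It shows that rows are dropped only when every column matches, which is why rows sharing a value in just one column all survive.

```python
def distinct_rows(rows):
    """Keep the first occurrence of each row, comparing every column."""
    seen = set()
    result = []
    for row in rows:
        # Build a hashable key from all column/value pairs of the row.
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            result.append(row)
    return result


# Hypothetical sample data
rows = [
    {"ID": "1", "Name": "A"},
    {"ID": "1", "Name": "A"},  # exact copy of the first row: dropped
    {"ID": "1", "Name": "B"},  # same ID but different Name: kept
]
unique = distinct_rows(rows)
```

With this sample, `unique` still holds two rows with ID "1", because they differ in the Name column.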
----------------------------------
Hope it helps you out and if my solution resolves your query, then please provide a big thumbs up so that the others members in the community having similar problem statement can track the answer easily in future.

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Technical Business Analyst,
WonderBotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------

SebT
Level 4
Hi @devneetmohanty07

Thank you for your answer. However, what I am trying to achieve is to get a collection containing only the duplicated values - not the unique values.

I am sorry, I misunderstood the question. You are right: the solution I provided returns rows that are unique across all fields, not just one. I have modified it for your requirement. The updated solution is shown below:

Pass one collection as an input parameter along with a text data item parameter called "Field Name", pass another collection as an output parameter, and use the following code in the code stage:

Output_Collection = Input_Collection.DefaultView.ToTable(True,Field_Name)


36188.png
36189.png
36190.png
36191.png

Please let me know if this solves your query. The output contains only the distinct values of the chosen field, so once you have them you can iterate over the input collection and compare it against this unique set to find the duplicates.
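As a rough illustration of that two-step idea, here is a minimal Python sketch (not Blue Prism code). The helper names `distinct_values` and `repeated_values` and the sample rows are hypothetical. The first helper mirrors a distinct on one column; the second counts how often each value occurs and reports the ones that appear more than once.

```python
from collections import Counter


def distinct_values(rows, field_name):
    """One entry per distinct value of the column, in first-seen order."""
    return list(dict.fromkeys(row[field_name] for row in rows))


def repeated_values(rows, field_name):
    """Values of the column that occur in more than one row."""
    counts = Counter(row[field_name] for row in rows)
    return [v for v in distinct_values(rows, field_name) if counts[v] > 1]


# Hypothetical sample data
rows = [
    {"ID": "1", "Name": "A"},
    {"ID": "2", "Name": "B"},
    {"ID": "1", "Name": "C"},
]
unique_ids = distinct_values(rows, "ID")
duplicate_ids = repeated_values(rows, "ID")
```

Note that the distinct step alone discards the other columns, so the full duplicate rows still have to be looked up in the original collection.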
Regards,
Devneet Mohanty

SebT
Level 4
@devneetmohanty07 @vinod chinthakindi

I might have phrased my question wrong. What I need is a collection containing all duplicated values that exist within a single collection.

As an example, let's say I have the following collection with the following values. What I would like is a filter or code stage that moves or copies the values 1 and 3 (since they are duplicates) to a separate collection. I hope this makes sense.

36192.png

Hi Seb,

We were confused for a while, probably by the heading of the post. Now that I understand your requirement, I have come up with a different approach that uses a LINQ query to get the duplicate records. Create a new business object and, on the Page Description stage of its Initialize action, add the External References ('System.Data.DataSetExtensions.dll', 'System.Core.dll') and Namespace Imports ('System.Data.DataSetExtensions', 'System.Linq') shown below so that the LINQ queries work properly. Also ensure that the language is set to 'Visual Basic':

36193.png
Once the code options are updated as shown above, create a new action named 'Get Duplicate Records' with two input parameters, Input Collection (Collection) and Field Name (Text). The duplicate records are fetched from the Input Collection based on the field name you provide. Also add an output parameter, Output Collection (Collection), as shown below:

36194.png

Add a code stage and use the code below, with the input and output arguments as shown:

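' Group rows by the trimmed text of the chosen field and keep only groups with more than one row.
' Note: CopyToDataTable() throws an InvalidOperationException if no duplicates are found.
Output_Collection = (From row In Input_Collection _
Group row By a = row(Field_Name).ToString().Trim() Into grp = Group _
Where grp.Count() > 1 _
Select grp.ToList()).SelectMany(Function(x) x).CopyToDataTable()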

36195.png
36196.png
36197.png

The run results are as follows:

Input Arguments:

36198.png

Output Result:

36199.png


You can publish the action and test the same from Process Studio. Let us know if this helps you out 🙂
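For anyone who wants to follow the grouping logic of the code stage step by step, here is a minimal sketch of the same technique in Python (not Blue Prism code). The function name `get_duplicate_records`, the column name "Value" and the sample rows are hypothetical. Rows are grouped by the trimmed text of the chosen field, groups with a single row are discarded, and the remaining groups are flattened back into one list.

```python
from collections import defaultdict


def get_duplicate_records(rows, field_name):
    """Return every row whose trimmed field value occurs in more than one row."""
    groups = defaultdict(list)
    for row in rows:
        # Group key: the field value as text, with surrounding spaces removed.
        groups[str(row[field_name]).strip()].append(row)
    # Keep only groups with more than one row, then flatten them.
    return [row for group in groups.values() if len(group) > 1 for row in group]


# Hypothetical sample data: the values 1 and 3 are duplicated
rows = [
    {"Value": "1"},
    {"Value": "2"},
    {"Value": "3"},
    {"Value": "1"},
    {"Value": " 3"},  # matches "3" after trimming
]
duplicates = get_duplicate_records(rows, "Value")
```

Unlike the distinct-based approach, this keeps the complete rows, so every occurrence of a duplicated value ends up in the result. When nothing is duplicated the sketch simply returns an empty list.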
Regards,
Devneet Mohanty

SebT
Level 4
@devneetmohanty07

Works like a charm! Thank you very much for your help and thorough explanation!