Get Unique Values from more than one column in a collection

vinodchinthakin
Level 9
Hi

I have a scenario where I need to get the unique values from more than one column in a collection. Right now I achieve this with loops, which takes a long time to execute because of the high number of records. I am looking for a way to do the same thing with a code stage, or any other method with less execution time.

Example:
Input Collection

ID   Name   Email          Mobile
1    A      A@GMAIL.COM    91****
2    B      B@GMAIL.COM    91****
3    C      C@GMAIL.COM    91****
4    A      A@GMAIL.COM    91****
5    B      B@GMAIL.COM    91****
6    D      D@GMAIL.COM    91****


Expected Output Collection:

Name   Email
A      A@GMAIL.COM
B      B@GMAIL.COM
C      C@GMAIL.COM
D      D@GMAIL.COM

Edit:
The input collection can also have some null values in the Email column.



------------------------------
vinod chinthakindi
------------------------------
8 REPLIES

RushabhDedhia
Level 4
Hey @vinod chinthakindi,

You can try the code below to remove duplicates. Keep in mind that, whether it is C# or any other programming language, the code will still loop over the rows (for each) internally. Set the table and column name as appropriate for your collection.

// Requires: using System.Collections; using System.Data;
public DataTable RemoveDuplicateRows(DataTable dTable, string colName)
{
   Hashtable hTable = new Hashtable();
   ArrayList duplicateList = new ArrayList();

   // Record each value of colName the first time it is seen;
   // rows whose value has already been seen are collected as duplicates.
   foreach (DataRow drow in dTable.Rows)
   {
      if (hTable.Contains(drow[colName]))
         duplicateList.Add(drow);
      else
         hTable.Add(drow[colName], string.Empty);
   }

   // Remove the duplicate rows from the DataTable.
   foreach (DataRow dRow in duplicateList)
      dTable.Rows.Remove(dRow);

   // The DataTable now holds one row per distinct value of colName.
   return dTable;
}

Thanks & Regards

------------------------------
Rushabh Dedhia
Founder,
Biznessology (https://www.linkedin.com/company/biznessology/)
+91 9428860307
------------------------------

Hi Vinod,

You can extend the 'Collection Manipulation' business object. Add the namespace import 'System.Collections.Generic' on the Page Description stage of the Initialize action so that List(Of String) in the code below resolves, and ensure that the language is set to 'Visual Basic'.


Once the code options are updated, create a new action named 'Get Unique Values' with two input arguments: 'Input Collection' (Collection) and 'Field Names' (Collection), the latter having a single text column called 'Fields'. Unique rows are fetched from the Input Collection based on the field names listed in the rows of the Field Names collection. Also add an output parameter 'Output Collection' (Collection) to this action, as shown below:

28062.png

Add a code stage and use the code below, with the input and output arguments as shown:

Dim listOfFields = New List(Of String)()

For Each row In Field_Names.Rows
	listOfFields.Add(row("Fields").ToString())
Next

Output_Collection = Input_Collection.DefaultView.ToTable(True, listOfFields.ToArray())


28063.png
28064.png
28065.png

The run results are as follows:

Input Arguments:

28066.png

28067.png

Output Result:

28068.png
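
Since the screenshots are not reproduced here, the following is a minimal standalone sketch of the same call, written in C# so it can be tried outside Blue Prism. The class and method names (ToTableDemo, BuildInput, UniqueValues) and the column types are assumptions made for illustration; only DefaultView.ToTable(True, ...) comes from the code stage above.

```csharp
using System;
using System.Data;

public static class ToTableDemo
{
    // Builds the sample input from the question (column types are assumed).
    public static DataTable BuildInput()
    {
        var t = new DataTable("Input");
        t.Columns.Add("ID", typeof(int));
        t.Columns.Add("Name", typeof(string));
        t.Columns.Add("Email", typeof(string));
        t.Columns.Add("Mobile", typeof(string));
        t.Rows.Add(1, "A", "A@GMAIL.COM", "91****");
        t.Rows.Add(2, "B", "B@GMAIL.COM", "91****");
        t.Rows.Add(3, "C", "C@GMAIL.COM", "91****");
        t.Rows.Add(4, "A", "A@GMAIL.COM", "91****");
        t.Rows.Add(5, "B", "B@GMAIL.COM", "91****");
        t.Rows.Add(6, "D", "D@GMAIL.COM", "91****");
        return t;
    }

    // Same call as the code stage: distinct rows over the listed columns only.
    public static DataTable UniqueValues(DataTable input, params string[] fields)
    {
        return input.DefaultView.ToTable(true, fields);
    }
}
```

Calling ToTableDemo.UniqueValues(ToTableDemo.BuildInput(), "Name", "Email") should give a two-column table with the four rows A, B, C and D, matching the expected output in the question.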

You can publish the action and test it from Process Studio. Let us know if this helps you out 🙂

------------------------------
----------------------------------
Hope it helps you out. If my solution resolves your query, please mark it as the 'Best Answer' so that other members of the community with a similar problem can find the answer easily in future.

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
WonderBotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------

Hi @devneetmohanty07

I tried your code but got an error saying "row is not declared".

I added the namespace import 'System.Collections.Generic' and a reference to System.Linq.dll, with the language set to 'Visual Basic'.
Am I missing a specific DLL for the above code?

------------------------------
vinod chinthakindi
------------------------------

Hi Vinod,

You don't even need the LINQ DLL for this. It only needs the namespace import 'System.Collections.Generic', added alongside the default namespace imports that are generated automatically whenever you create a new business object. See the screenshot below for reference:

28081.png

The error you are getting might be because the data type of the row variable is not declared in the For Each statement. I do not need to do that, I guess because I am on Blue Prism v6.10.4, which uses an updated .NET runtime environment.

You can try this updated code and check if it works:


Dim listOfFields = New List(Of String)()

For Each row As System.Data.DataRow In Field_Names.Rows
	listOfFields.Add(row("Fields").ToString())
Next

Output_Collection = Input_Collection.DefaultView.ToTable(True, listOfFields.ToArray())


Here, I have only added 'As System.Data.DataRow' to the For Each statement, which declares the variable row as a DataRow object.

------------------------------
Devneet Mohanty
------------------------------

Hi @devneetmohanty07
I am using BP 6.4.2. The above code change resolved the compile error, but at execution it throws the following error. Any idea?

28094.png

------------------------------
vinod chinthakindi
------------------------------

Satish1414
Level 4
Hi Vinod ,

If you are using a code stage, use a dictionary with "Name" as the key, or, if you want a combination of two columns, use a combined key such as "Id+Name". That way you can keep duplicates out of the output collection. Please find the screenshot of the C# code for reference.

28103.png
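
The screenshot is not reproduced here. As a rough sketch of the idea (not the code from the screenshot), the following builds one string key per row from the chosen columns and keeps a row only the first time its key is seen. The names CompositeKeyDistinct and DistinctBy, the "n"/"v" prefixes used to tell null cells from values, and the unit-separator character used to join the parts are all assumptions; the sketch also assumes no cell value contains that separator.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class CompositeKeyDistinct
{
    // Hypothetical helper: one output row per distinct combination of keyColumns.
    public static DataTable DistinctBy(DataTable input, params string[] keyColumns)
    {
        // The output holds only the key columns, typed like the input columns.
        var output = new DataTable();
        foreach (string col in keyColumns)
            output.Columns.Add(col, input.Columns[col].DataType);

        var seen = new Dictionary<string, bool>();
        foreach (DataRow row in input.Rows)
        {
            var values = new object[keyColumns.Length];
            var parts = new string[keyColumns.Length];
            for (int i = 0; i < keyColumns.Length; i++)
            {
                values[i] = row[keyColumns[i]];
                // Null cells get their own marker so they compare equal to each other
                // and never collide with a real value.
                parts[i] = row.IsNull(keyColumns[i]) ? "n" : "v" + values[i].ToString();
            }

            // Assumed separator: the unit-separator control character.
            string key = string.Join("\u001F", parts);
            if (!seen.ContainsKey(key))
            {
                seen.Add(key, true);
                output.Rows.Add(values);
            }
        }
        return output;
    }
}
```

The dictionary lookup means each row is checked once against a hashed key rather than compared against every row kept so far, which is the point of using a dictionary for large collections. Joining with a separator, rather than plain concatenation, avoids treating "AB" + "C" and "A" + "BC" as the same key.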
Hope this helps.

Regards,
Satish Gunturi
Senior Consultant
Ignite IPA Pvt Ltd

------------------------------
Satish Gunturi
------------------------------

Joshna_16
Level 4
Hi,

Can you try the code below in a code stage:

Collection_Out = Collection_In.DefaultView.ToTable(True)

Input : Collection_In
Output: Collection_Out

Language : Visual Basic
No additional namespaces required
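
One thing to keep in mind with this form: without a column list, ToTable(True) compares every column, so rows that differ only in a unique column such as ID are all kept. A minimal standalone C# sketch of the difference follows; the class name, method names and the three-row sample are assumptions for illustration.

```csharp
using System;
using System.Data;

public static class FullRowDistinctDemo
{
    // Small sample in the shape of the question's input (types are assumed).
    public static DataTable BuildInput()
    {
        var t = new DataTable("Input");
        t.Columns.Add("ID", typeof(int));
        t.Columns.Add("Name", typeof(string));
        t.Columns.Add("Email", typeof(string));
        t.Rows.Add(1, "A", "A@GMAIL.COM");
        t.Rows.Add(2, "B", "B@GMAIL.COM");
        t.Rows.Add(3, "A", "A@GMAIL.COM");
        return t;
    }

    // Distinct over all columns: ID makes every row different.
    public static int CountDistinctAllColumns(DataTable input)
    {
        return input.DefaultView.ToTable(true).Rows.Count;
    }

    // Distinct over the named columns only.
    public static int CountDistinctOn(DataTable input, params string[] columns)
    {
        return input.DefaultView.ToTable(true, columns).Rows.Count;
    }
}
```

For this sample, CountDistinctAllColumns returns 3 while CountDistinctOn(input, "Name", "Email") returns 2, so the column list is needed when the collection contains an ID-like column.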

------------------------------
Joshna Dammala
Project Engineer
Asia/Kolkata
------------------------------

I think DefaultView.ToTable(True) will not work when a collection contains another collection (a nested collection).

------------------------------
Jörg Kalkmann
------------------------------