17-03-23 01:46 AM
Hi All,
Can anyone help me on custom code to remove duplicate rows from a collection based on 3 column names. The input collection has 6 columns in total. i wanted to remove duplicates based on 3 columns.
Thanks in Advance
Answered! Go to Answer.
17-03-23 10:23 AM
Hello Amrutha Sivarajan,
You can add an action in one of your extended objects for collection manipulations, make sure you have correct external references for Dll and their Namespaces listed in Initialise stage, please follow below approach:
System.GC.Collect()
'filteredCollection=RawData.clone()
'filteredCollection=RawData.DefaultView.ToTable(true, columnToCheck)
dim count as integer=rawData.rows.count-1
dim i as integer =0
dim rowIndex as integer
for  rowIndex = rawData.rows.count-2 to 0 step rowIndex-1
    if rawData.rows(rowIndex).item(columnToCheck1)=rawData.rows(rowIndex+1).item(columnToCheck1) and 
        rawData.rows(rowIndex).item(columnToCheck2)=rawData.rows(rowIndex+1).item(columnToCheck2) and 
        rawData.rows(rowIndex).item(columnToCheck3)=rawData.rows(rowIndex+1).item(columnToCheck3)
        rawData.rows(rowIndex+1).delete
    end if
next
filteredCollection=rawData.copy()
Let me know if you find any difficulties implementing this.
17-03-23 05:00 AM
Hi Amrutha
Can you try the below code
Dim dt As DataTable = [your DataTable]
Dim columnNames() As String = inputColumnName.Split(","c) 'replace with your input parameter that contains the column names
Dim distinctDT As DataTable = dt.DefaultView.ToTable(True, columnNames)
above code from chat gpt and also refer below thread for better understanding
Delete Repetitive Values in a collection or Excel | Blue Prism Product
Hope this helps
Regards
Lakshmi Narayana
17-03-23 10:13 AM
Hi Amrutha,
Avoidance of duplicates can be achieved in two ways:
While Lakshmi describes how to tackle the first method, you may also consider to have a look at the second method.
If your collection data was derived from an XL, you might consider extracting the data using OLEDB using an SQL DISTINCT statement that, if constructed correctly, will only populate the collection with unique data.
If you data comes from other sources, you might consider checking if a row to be added to the collection perhaps already exists there. And then not adding it if it appears to be a duplicate.
17-03-23 10:23 AM
Hello Amrutha Sivarajan,
You can add an action in one of your extended objects for collection manipulations, make sure you have correct external references for Dll and their Namespaces listed in Initialise stage, please follow below approach:
System.GC.Collect()
'filteredCollection=RawData.clone()
'filteredCollection=RawData.DefaultView.ToTable(true, columnToCheck)
dim count as integer=rawData.rows.count-1
dim i as integer =0
dim rowIndex as integer
for  rowIndex = rawData.rows.count-2 to 0 step rowIndex-1
    if rawData.rows(rowIndex).item(columnToCheck1)=rawData.rows(rowIndex+1).item(columnToCheck1) and 
        rawData.rows(rowIndex).item(columnToCheck2)=rawData.rows(rowIndex+1).item(columnToCheck2) and 
        rawData.rows(rowIndex).item(columnToCheck3)=rawData.rows(rowIndex+1).item(columnToCheck3)
        rawData.rows(rowIndex+1).delete
    end if
next
filteredCollection=rawData.copy()
Let me know if you find any difficulties implementing this.
18-03-23 04:17 AM
Hi Mukesh
Thanks for your solution
I have tested but their is thing that it is checking for consecutive rows I guess what if their is one row at index like 1 and same is repeated at row index 10
I have implemented and tested it but it is not able to remove duplicate rows which are not in consecutive positions
Can you please check if I am doing any wrong
Regards
18-03-23 04:33 AM
Hi Mukesh
Got the answer from your reply in other thread as first need to sort the collection with any one of the column in the three columns
Please ignore the above
Regards
18-03-23 06:43 AM
Glad you were able to sort that out ! There was a similar query but for one column value duplicate - Row removal.
The only thing that changes in the code is the AND condition operator for column/columns to check.
