Collection Unique Values

TomMason · ‎01-12-21

Hi All,

Hope you're well....

Hoping you can help with this one..... I'm working on extracting information from a Table within a product called Northgate Revs & Bens.

I have to use UIA Mode and am using the Get Rows functionality to extract the info into a Collection. This all works fine.... except for the last column.
This seems to be duplicating the information held within it but the info is actually only present once within the table. I need a method of just getting one instance of the information held within this column of the table...

E.g.

Currently on version 6.10

Kindest Regards,

Tom

devneetmohanty07 · ‎01-12-21

Hi Tom,

A similar issue happened to another community member yesterday. You can use the below solution that I mentioned on that post:

Extend the Collection Manipulation VBO by adding an action and pass one collection as an input parameter along with a text data item parameter called "Field Name" and pass another collection as an output parameter and use the below code in the code stage:

Output_Collection = Input_Collection.DefaultView.ToTable(True,Field_Name)

You can publish the action and pass the collection that you get from the UI and the name of the column which you want to use for finding unique values into this action by using the action stage from Process Studio. Please let me know if it solves your query.

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

TomMason · ‎01-12-21

Hi Devneet,

Many thanks for your reply....

I've just built the Action as suggested within the Collection action.... It extracts the required column into the New Output collection but the values are still duplicated....

Kindest Regards,

Tom

devneetmohanty07 · ‎01-12-21

Hi Tom,

When I see the collection that you have shared, it seems that technically all the values present are actually unique. If you observe carefully each of the row have different values. What I think you are trying to achieve is perhaps to get individual named occupants in a collection such as:

----------------------
Named Occupants
----------------------
Catherine Jane De Roos
----------------------
Miss Ashley Schofield
----------------------
Miss Kira Schofield

and so on.... it means that unique values are actually present within each row of this collection

If this is what you want to obtain then perhaps you would need to iterate through your collection and split your strings with respect to delimiters such as '&' and '<d>' characters and then once they have been splitted you can check if they are available in a final collection with the same field structure or not. If they are not available in that collection, you can add a row in the final collection and add that value or skip to the next delimited value.

In order to Split the values, you can use the 'Split Text' action from 'Utility - Strings' VBO and in order to check if a value exists in any given collection, you can use 'Collection Contains Value' action from 'Utility - Collection Manipulation' VBO.

If I can replicate the data at my end or if you provide me some sample data like this in form of the format I showed aboved, I can try a solution at my end as well.

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

TomMason · ‎01-12-21

Hi Devneet,

Thank you for your reply..... very helpful however still doesn't get what I need....

Basically, I just need my initial Collection to have 1 instance of the string of text within the Named Occupants column. So looking at my screen prints you can see that the first row has

Catherine Jane De Roos & Miss Ashley Schofield Catherine Jane De Roos & Miss Ashley Schofield

Within it.....

I just need that column within the Collection to have:

Catherine Jane De Roos & Miss Ashley Schofield

To make things more interesting then sometimes there is a & or <d> but not always, so there isn't always an Identifier to split the text by.

Its a head wrecker hahaha

Kind Regards,

Tom

devneetmohanty07 · ‎02-12-21

Hey Tom,

Sorry couldn't reply you sooner as I got stuck with some other work at my end. However, I went through the data you provided and came up with a solution based on that. However, it's bit complex workflow though I will try to explain to my best. First let's break our use case into two parts.

1) In the data there are few delimiters which may or may not occur which we are currently identifying as '&' and '<d>'. (This is comparatively easy to play around !!)

2) Again, we see some repeating names as well which are separated by spaces which is bit complicated to handle if no unique instances can be found before hand which means if let say I have a value called 'Catherine Jane De Roos Miss Ashley Schofield'. Now in order for the solution to work we must have a unique row before hand saying either 'Catherine Jane De Roos' or 'Miss Ashley Schofield' so that we can go ahead and get both of these values separately. If these unique values are not there in any of the pre-existing rows then for us to segregate these two names is practically not possible since we logically can't separate a name via spaces or determine the actual name by just looking at it. Some kind of a lookup needs to be in place before hand. (This one practically depends on the data we get, but the catch is if we can get unique values by splitting the data as we see in case 1, first then we can surely derive some logic to separate the names!)

So first I will show you the result of the solution that I got:

Inputs:

Outputs:

Pretty much what we want to see right?

Now diving into the solution, the first half of the solution looks something like below:

1) Here, I first loop through the input collection where I got a single column of the Named Occupants with me, in my case I call it 'Output Collection'. Please don't go by the name of the collection and consider it as the input of this solution only.

2) Within the loop I first use the 'Split Text' action and provide the input as the input collection name, the field name I want to split and '&' character as the character to be considered for splitting.

3) I store the value in 'Split Values' collection. Next, I check if the row count of the collection is greater than 1, if yes then loop through the 'Split Values' collection and check if the current named occupant exist in the 'Temp Collection'.

4) This 'Temp Collection' we would create at the design time having the same field structure as our input collection ('Named Occupants' field of text type) with no rows or data. If the current Split Value exists in the 'Temp Collection' skip the iteration and check for the next value in 'Split Collection' otherwise we add the row to 'Temp Collection' and set the current value of 'Split Values' collection to the added row field of the 'Temp Collection'. Once the iteration in 'Split Values' complete we proceed to the next iteration for the 'Output Collection'.

5) If the row count of 'Split Collection' for '&' operator was not greater than 1, we would move with the next Split character '<d>' and repeat the same steps as above.

All these steps will continue till we process for all rows in the Output Collection and finally we would call the 'Keep Unique Values In Collection' action which I shared with you at the very beginning of this thread to remove any duplicates found.

Now you get something like this in your 'Temp Collection';

So you can see we almost got it, however due to the second issue that I mentioned at the starting of this post there are two rows with unwanted name of another row in this collection.

In order to get rid of the second problem, now we would apply the second part of the solution: as shown below:

6) In this part of the solution, I would iterate through the 'Temp Collection' and first get the count of our output collection for the solution which is 'Final Collection'. This collection also will have the same schema as the 'Temp Collection' ('Named Occupants' field of text type) but will not have any rows or data. If the count is zero, it means that it is the very first record we are checking for 'Final Collection' so simply add a row to this collection and set the 'Named Occupant' field value of 'Temp Collection' to the newly added row of the 'Final Collection' and we proceed with the next value of the 'Temp Collection'

7) Now if the row count of the 'Final Collection' is not zero (From second iteration of 'Temp Collection' we will always have this condition as True), we will iterate through the 'Final Collection' and will retrieve the length of the current 'Named Occupant' field from the 'Final Collection' and 'Temp Collection'.

😎 Now we will be first checking which of the length is greater among the two so that we can decide how to use the InStr function to check the presence of the 'Named Occupant' field in either of the collection. If the length of 'Named Occupant' field in 'Final Collection' (I have mentioned this as 'Inner Occupant' in my code) is greater than the length of 'Named Occupant' field in 'Temp Collection' (I have mentioned this as 'Outer Occupant' in my code) then we would check if the Inner Occupant is included within the Outer Occupant by using the following expression: InStr([Final Collection.Named Occupant],[Temp Collection.Named Occupant]) > 0 If this expression is True then we can say that the Inner Occupant is included within the Outer Occupant and we would remove the Outer Occupant from the Inner Occupant using the following expression: Replace([Final Collection.Named Occupant],[Temp Collection.Named Occupant], "") and then add a new row to the 'Final Collection' and set the replaced value to the 'Final Collection'. We would then remove the empty rows from the 'Final Collection' and proceed with the next iteration in the 'Temp Collection'. In case above expression was False, then we would skip the current iteration for 'Final Collection' and move with the next row in the 'Final Collection'.

9) If the length of 'Named Occupant' field in 'Final Collection' is not greater than the length of 'Named Occupant' field in 'Temp Collection' as mentioned in Step 8, then we would check if the Outer Occupant is included within the Inner Occupant by using the following expression: InStr([Temp Collection.Named Occupant],[Final Collection.Named Occupant]) > 0 If this expression is True then we can say that the Outer Occupant is included within the Inner Occupant and we would remove the Inner Occupant from the Outer Occupant using the following expression: Replace([Temp Collection.Named Occupant],[Inner Collection.Named Occupant], "") and then add a new row to the 'Final Collection' and set the replaced value to the 'Final Collection'. We would then remove the empty rows from the 'Final Collection' and proceed with the next iteration in the 'Temp Collection'. In case above expression was False, then we would skip the current iteration for 'Final Collection' and move with the next row in the 'Final Collection'.

The variables that are used in the solution are:

I hope my explanation was clear as I can't share the bprelease file directly, though please take your time and go through it. Do let me know if you have any questions. Would be glad to help out!

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

SS&C Blue Prism Community

Collection Unique Values