cancel
Showing results for 
Search instead for 
Did you mean: 

performing a Delimited Action on Excel

OyinkansolaMusa
Level 3
Hello,

I really need help solving this and I have spent 2 weeks trying to figure this out.

I have a record in excel and I would like to perform a delimited action. For example, I have column A with JOHN DOE ALLI I would like to remove the DOE ALLI from column A using the space as the separator. My end result in column A should just be JOHN.

1 BEST ANSWER

Helpful Answers

Hi,

If you have a large set of data and that can increase in even near future you can go with an OLEDB approach to make use of your excel file as a database. My advice would be to use the "Data - OLEDB" VBO which will read your excel file as a database using the ODBC drivers. For this VBO to work, you need to install the Microsoft Access Database Engine which you can get from Microsoft's website: Microsoft Access Database Engine.

So now I have created the below workflow which is as follows:

36953.png

There are basically three parts of the code since if I just use the OLEDB part which is the second part of the code, I can get your splitted name in one column but in a different sheet altogether which needs to be available by default. So, there are three parts indicated by each grey block area in the above workflow.

Part I Of Code: Create Filtered Name Worksheet:

As you can see below is my input file:

36954.png
Now, what I am doing in the first part of the code is to create a replicant worksheet with just a different sheet name and a header cell at "A1" with the value as "Filtered Name" as shown below:

36955.png
So post executing this section, I get the below output:

36956.png

NOTE: The reason why we went with the Part I of the code is because the OLEDB we are going to use will need a separate worksheet with this header to be available by default in order to work and the splitted value will always be generated in this separate sheet. If you want to have everything in your original sheet, that is where the Part III of the code will come into picture.

Part II Of Code: Create Column With First Splitted Value:

Here, I have specifically use the "Data - OLEDB" VBO, since it can work on thousands of records seamlessly and is always a recommended way to go for large datasets.

36957.png

Now, in order to use it you would require a connection string. The connection string I am using is as follows and you can try with the same:

  Provider=Microsoft.ACE.OLEDB.12.0;Data Source=<FILE_PATH>;Extended Properties="Excel 12.0;HDR=YES;"

I replace the highlighted string with the excel file path that I have to get a value like this:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\DevneetMohanty\Documents\Blue Prism Use Cases\Split Book.xlsx;Extended Properties="Excel 12.0;HDR=YES;"

Now this is my final connection string, from here onwards first I use the action, 'Set Connection' from the 'Data-OLEDB' business object and pass this connection string there:

36958.png

Next, I use 'Open' action from the 'Data-OLEDB' business object to open the session and then I create query in order to split the column called 'Name' from 'Names' worksheet and put the first splitted value in the 'Filtered Name' column that we created in part I of the code in the 'Filtered Names' worksheet. The query looks as below:

INSERT INTO [Filtered Names$]([Filtered Name]) SELECT (MID([Name],1,INSTR([Name]," ")-1)) FROM [Names$A:A]


To generate this query I used a calculation stage and used the below expression to store it inside a data item:

36959.png

NOTE: Here I have used the expression [Name$A:A] since I have my limiting my data in the Name worksheet within 'A' column of excel only. If you have the value at some other column replace 'A' with that column index value.

Post this I use the 'Execute' action and pass the query data item inside it to execute the query and split the data as shown below:

36960.png

Next, I use 'Close' action from the 'Data-OLEDB' business object to close the connection to the file and now I get the below output:



36961.png

So now if you are happy with this output in a separate sheet then part III of the code is not required but if you want it to be in the same sheet for some reason then I have created the part III of the code for that purpose.

Part III Of Code: Merge Results In Original Sheet:

So this part of the workflow simply opens up the file, copies the worksheet range from 'Filtered Names' sheet that we created to a column range in my original worksheet which was called 'Names' and then deletes the 'Filtered Names' worksheet so that we only have the original worksheet at the end of the day with us.

The 'Copy And Paste Range' action has the below parameters where I am basically copying the entire results that we got to column 'B' of my original worksheet:

36962.png
36963.png

So after the running the last part of the code, you will end up with the below results:

36964.png

So, this workflow although a pretty long one has the benefits of not having any memory issues as we are not using collections at all and secondly since ODBC drivers are being used the speed also is good as for me it took 10 seconds to process 6500 records which I have in my input file. Lastly, tomorrow even if you get 60,000 or more records this solution can work even in those scenarios.

Alternatives Apart From This Solution:

1) Other approaches also can be used like of power queries if you are familiar with them where I mentioned how to use them with a document file which you can follow for the same and you can use it to trigger from Blue Prism: Working With Power Queries as they are faster even than OLEDB approach and yes they definitely can work with very large datasets.

2) Also, one approach can be to use the excel macros but they have restrictions at times in some organizations so I did not recommend that solution. But if you go via that route that also can give you results but you need to build it in such a way that Blue Prism enters the file path and column which you want to split in one sheet and then taking those as a parameter your macro would do the work once Blue Prism calls the macro. However, that needs some VBA knowledge and it won't work that nicely with a very large dataset.

3) Next approach as you figured out can be to use native excel VBO actions but they have their limitations when it comes to a very large dataset and you may not be able to use them at all. Hence, I did not recommend them. But, yes using those also I can tell you a solution which can give you the desired results either using excel actions or by extending them to create a custom action with a code stage.

All these above mentioned approaches will never require any collections and will help you to work with any dataset given any degree of complexity to it avoiding memory issues with Blue Prism altogether.

------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

View answer in original post

6 REPLIES 6

Hi Oyinkansola,

You can read the column A from the excel file into a collection using 'Get Worksheet Collection' action from 'MS Excel - VBO' business object. I am assuming the collection here has the column name as 'Name' and I call this collection as 'Data' as shown below:

36943.png
Currently, let say from your excel file the data got read as below:

36944.png
Then you can iterate over the collection using a Loop Stage and defining the collection name inside it as shown below:

36945.png
Then use a calculation stage with the following expression and store in a data item of type text called as 'First Set':

Trim(Left([Data.Name], InStr([Data.Name], " ")))​




36946.png


Now, you can have an output collection with the same schema as that of your input collection, 'Data' called as 'Result' as shown below:

36947.png

Now, you use the 'Add Row' action from 'Internal - Collections' object and pass the input argument as the collection name of 'Result' as shown below:

36948.png
Now use a calculation stage to set the value of 'Result.Name' as [First Set] data item as shown below:

36949.png

At the end this is how your workflow should look like along with the output data:

36950.png

Results:

36951.png

NOTE: This workflow will be applicable after you have read the data from excel into a collection

------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

OyinkansolaMusa
Level 3
Hi Devneet,

This approach works fine for a small number of records however, the record I am working with is over 6000 so writing them to a collection might not be the best approach. Although I have gotten a vbo that works I guess it needs some little tweaks. The vbo performs the action but retains the not needed data in different columns.  
36952.png

Hi Oyinkansola,

Glad your issue got resolved. There are multiple ways to achieve the same thing but I would recommend next time please also mention your exact use case since that will determine the solution I can provide you. If you would have told me about having that many records I would have recommended you something different.
---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

OyinkansolaMusa
Level 3
Hello,

I would like to know what you would have recommended maybe your approach will resolve the issue 100%. Currently, I am using the Text to Column VBO. It works but still retains the unwanted texts (DOE ALLI) in different columns. I don't want it to retain them I only need the JOHN

Hi,

If you have a large set of data and that can increase in even near future you can go with an OLEDB approach to make use of your excel file as a database. My advice would be to use the "Data - OLEDB" VBO which will read your excel file as a database using the ODBC drivers. For this VBO to work, you need to install the Microsoft Access Database Engine which you can get from Microsoft's website: Microsoft Access Database Engine.

So now I have created the below workflow which is as follows:

36953.png

There are basically three parts of the code since if I just use the OLEDB part which is the second part of the code, I can get your splitted name in one column but in a different sheet altogether which needs to be available by default. So, there are three parts indicated by each grey block area in the above workflow.

Part I Of Code: Create Filtered Name Worksheet:

As you can see below is my input file:

36954.png
Now, what I am doing in the first part of the code is to create a replicant worksheet with just a different sheet name and a header cell at "A1" with the value as "Filtered Name" as shown below:

36955.png
So post executing this section, I get the below output:

36956.png

NOTE: The reason why we went with the Part I of the code is because the OLEDB we are going to use will need a separate worksheet with this header to be available by default in order to work and the splitted value will always be generated in this separate sheet. If you want to have everything in your original sheet, that is where the Part III of the code will come into picture.

Part II Of Code: Create Column With First Splitted Value:

Here, I have specifically use the "Data - OLEDB" VBO, since it can work on thousands of records seamlessly and is always a recommended way to go for large datasets.

36957.png

Now, in order to use it you would require a connection string. The connection string I am using is as follows and you can try with the same:

  Provider=Microsoft.ACE.OLEDB.12.0;Data Source=<FILE_PATH>;Extended Properties="Excel 12.0;HDR=YES;"

I replace the highlighted string with the excel file path that I have to get a value like this:

Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\DevneetMohanty\Documents\Blue Prism Use Cases\Split Book.xlsx;Extended Properties="Excel 12.0;HDR=YES;"

Now this is my final connection string, from here onwards first I use the action, 'Set Connection' from the 'Data-OLEDB' business object and pass this connection string there:

36958.png

Next, I use 'Open' action from the 'Data-OLEDB' business object to open the session and then I create query in order to split the column called 'Name' from 'Names' worksheet and put the first splitted value in the 'Filtered Name' column that we created in part I of the code in the 'Filtered Names' worksheet. The query looks as below:

INSERT INTO [Filtered Names$]([Filtered Name]) SELECT (MID([Name],1,INSTR([Name]," ")-1)) FROM [Names$A:A]


To generate this query I used a calculation stage and used the below expression to store it inside a data item:

36959.png

NOTE: Here I have used the expression [Name$A:A] since I have my limiting my data in the Name worksheet within 'A' column of excel only. If you have the value at some other column replace 'A' with that column index value.

Post this I use the 'Execute' action and pass the query data item inside it to execute the query and split the data as shown below:

36960.png

Next, I use 'Close' action from the 'Data-OLEDB' business object to close the connection to the file and now I get the below output:



36961.png

So now if you are happy with this output in a separate sheet then part III of the code is not required but if you want it to be in the same sheet for some reason then I have created the part III of the code for that purpose.

Part III Of Code: Merge Results In Original Sheet:

So this part of the workflow simply opens up the file, copies the worksheet range from 'Filtered Names' sheet that we created to a column range in my original worksheet which was called 'Names' and then deletes the 'Filtered Names' worksheet so that we only have the original worksheet at the end of the day with us.

The 'Copy And Paste Range' action has the below parameters where I am basically copying the entire results that we got to column 'B' of my original worksheet:

36962.png
36963.png

So after the running the last part of the code, you will end up with the below results:

36964.png

So, this workflow although a pretty long one has the benefits of not having any memory issues as we are not using collections at all and secondly since ODBC drivers are being used the speed also is good as for me it took 10 seconds to process 6500 records which I have in my input file. Lastly, tomorrow even if you get 60,000 or more records this solution can work even in those scenarios.

Alternatives Apart From This Solution:

1) Other approaches also can be used like of power queries if you are familiar with them where I mentioned how to use them with a document file which you can follow for the same and you can use it to trigger from Blue Prism: Working With Power Queries as they are faster even than OLEDB approach and yes they definitely can work with very large datasets.

2) Also, one approach can be to use the excel macros but they have restrictions at times in some organizations so I did not recommend that solution. But if you go via that route that also can give you results but you need to build it in such a way that Blue Prism enters the file path and column which you want to split in one sheet and then taking those as a parameter your macro would do the work once Blue Prism calls the macro. However, that needs some VBA knowledge and it won't work that nicely with a very large dataset.

3) Next approach as you figured out can be to use native excel VBO actions but they have their limitations when it comes to a very large dataset and you may not be able to use them at all. Hence, I did not recommend them. But, yes using those also I can tell you a solution which can give you the desired results either using excel actions or by extending them to create a custom action with a code stage.

All these above mentioned approaches will never require any collections and will help you to work with any dataset given any degree of complexity to it avoiding memory issues with Blue Prism altogether.

------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------

---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

OyinkansolaMusa
Level 3
Hello Devneet,

This was really helpful.