25-05-22 01:50 PM
I have a record in excel and I would like to perform a delimited action. For example, I have column A with JOHN DOE ALLI I would like to remove the DOE ALLI from column A using the space as the separator. My end result in column A should just be JOHN.
Answered! Go to Answer.
27-05-22 02:52 AM
Hi,
If you have a large set of data and that can increase in even near future you can go with an OLEDB approach to make use of your excel file as a database. My advice would be to use the "Data - OLEDB" VBO which will read your excel file as a database using the ODBC drivers. For this VBO to work, you need to install the Microsoft Access Database Engine which you can get from Microsoft's website: Microsoft Access Database Engine.
So now I have created the below workflow which is as follows:
There are basically three parts of the code since if I just use the OLEDB part which is the second part of the code, I can get your splitted name in one column but in a different sheet altogether which needs to be available by default. So, there are three parts indicated by each grey block area in the above workflow.
Part I Of Code: Create Filtered Name Worksheet:
As you can see below is my input file:
Now, what I am doing in the first part of the code is to create a replicant worksheet with just a different sheet name and a header cell at "A1" with the value as "Filtered Name" as shown below:
So post executing this section, I get the below output:
NOTE: The reason why we went with the Part I of the code is because the OLEDB we are going to use will need a separate worksheet with this header to be available by default in order to work and the splitted value will always be generated in this separate sheet. If you want to have everything in your original sheet, that is where the Part III of the code will come into picture.
Part II Of Code: Create Column With First Splitted Value:
Here, I have specifically use the "Data - OLEDB" VBO, since it can work on thousands of records seamlessly and is always a recommended way to go for large datasets.
Now, in order to use it you would require a connection string. The connection string I am using is as follows and you can try with the same:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=<FILE_PATH>;Extended Properties="Excel 12.0;HDR=YES;"
I replace the highlighted string with the excel file path that I have to get a value like this:Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\DevneetMohanty\Documents\Blue Prism Use Cases\Split Book.xlsx;Extended Properties="Excel 12.0;HDR=YES;"
Now this is my final connection string, from here onwards first I use the action, 'Set Connection' from the 'Data-OLEDB' business object and pass this connection string there:
Next, I use 'Open' action from the 'Data-OLEDB' business object to open the session and then I create query in order to split the column called 'Name' from 'Names' worksheet and put the first splitted value in the 'Filtered Name' column that we created in part I of the code in the 'Filtered Names' worksheet. The query looks as below:
INSERT INTO [Filtered Names$]([Filtered Name]) SELECT (MID([Name],1,INSTR([Name]," ")-1)) FROM [Names$A:A]
To generate this query I used a calculation stage and used the below expression to store it inside a data item:
NOTE: Here I have used the expression [Name$A:A] since I have my limiting my data in the Name worksheet within 'A' column of excel only. If you have the value at some other column replace 'A' with that column index value.
Post this I use the 'Execute' action and pass the query data item inside it to execute the query and split the data as shown below:
Next, I use 'Close' action from the 'Data-OLEDB' business object to close the connection to the file and now I get the below output:
So now if you are happy with this output in a separate sheet then part III of the code is not required but if you want it to be in the same sheet for some reason then I have created the part III of the code for that purpose.
Part III Of Code: Merge Results In Original Sheet:
So this part of the workflow simply opens up the file, copies the worksheet range from 'Filtered Names' sheet that we created to a column range in my original worksheet which was called 'Names' and then deletes the 'Filtered Names' worksheet so that we only have the original worksheet at the end of the day with us.
The 'Copy And Paste Range' action has the below parameters where I am basically copying the entire results that we got to column 'B' of my original worksheet:
So after the running the last part of the code, you will end up with the below results:
So, this workflow although a pretty long one has the benefits of not having any memory issues as we are not using collections at all and secondly since ODBC drivers are being used the speed also is good as for me it took 10 seconds to process 6500 records which I have in my input file. Lastly, tomorrow even if you get 60,000 or more records this solution can work even in those scenarios.
Alternatives Apart From This Solution:
1) Other approaches also can be used like of power queries if you are familiar with them where I mentioned how to use them with a document file which you can follow for the same and you can use it to trigger from Blue Prism: Working With Power Queries as they are faster even than OLEDB approach and yes they definitely can work with very large datasets.
2) Also, one approach can be to use the excel macros but they have restrictions at times in some organizations so I did not recommend that solution. But if you go via that route that also can give you results but you need to build it in such a way that Blue Prism enters the file path and column which you want to split in one sheet and then taking those as a parameter your macro would do the work once Blue Prism calls the macro. However, that needs some VBA knowledge and it won't work that nicely with a very large dataset.
3) Next approach as you figured out can be to use native excel VBO actions but they have their limitations when it comes to a very large dataset and you may not be able to use them at all. Hence, I did not recommend them. But, yes using those also I can tell you a solution which can give you the desired results either using excel actions or by extending them to create a custom action with a code stage.
All these above mentioned approaches will never require any collections and will help you to work with any dataset given any degree of complexity to it avoiding memory issues with Blue Prism altogether.
------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future
Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com
----------------------------------
------------------------------
25-05-22 02:53 PM
Trim(Left([Data.Name], InStr([Data.Name], " ")))
Now, you can have an output collection with the same schema as that of your input collection, 'Data' called as 'Result' as shown below:
Now, you use the 'Add Row' action from 'Internal - Collections' object and pass the input argument as the collection name of 'Result' as shown below:
Now use a calculation stage to set the value of 'Result.Name' as [First Set] data item as shown below:
At the end this is how your workflow should look like along with the output data:
Results:
NOTE: This workflow will be applicable after you have read the data from excel into a collection
------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future
Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com
----------------------------------
------------------------------
26-05-22 01:49 PM
26-05-22 02:31 PM
26-05-22 04:17 PM
27-05-22 02:52 AM
Hi,
If you have a large set of data and that can increase in even near future you can go with an OLEDB approach to make use of your excel file as a database. My advice would be to use the "Data - OLEDB" VBO which will read your excel file as a database using the ODBC drivers. For this VBO to work, you need to install the Microsoft Access Database Engine which you can get from Microsoft's website: Microsoft Access Database Engine.
So now I have created the below workflow which is as follows:
There are basically three parts of the code since if I just use the OLEDB part which is the second part of the code, I can get your splitted name in one column but in a different sheet altogether which needs to be available by default. So, there are three parts indicated by each grey block area in the above workflow.
Part I Of Code: Create Filtered Name Worksheet:
As you can see below is my input file:
Now, what I am doing in the first part of the code is to create a replicant worksheet with just a different sheet name and a header cell at "A1" with the value as "Filtered Name" as shown below:
So post executing this section, I get the below output:
NOTE: The reason why we went with the Part I of the code is because the OLEDB we are going to use will need a separate worksheet with this header to be available by default in order to work and the splitted value will always be generated in this separate sheet. If you want to have everything in your original sheet, that is where the Part III of the code will come into picture.
Part II Of Code: Create Column With First Splitted Value:
Here, I have specifically use the "Data - OLEDB" VBO, since it can work on thousands of records seamlessly and is always a recommended way to go for large datasets.
Now, in order to use it you would require a connection string. The connection string I am using is as follows and you can try with the same:
Provider=Microsoft.ACE.OLEDB.12.0;Data Source=<FILE_PATH>;Extended Properties="Excel 12.0;HDR=YES;"
I replace the highlighted string with the excel file path that I have to get a value like this:Provider=Microsoft.ACE.OLEDB.12.0;Data Source=C:\Users\DevneetMohanty\Documents\Blue Prism Use Cases\Split Book.xlsx;Extended Properties="Excel 12.0;HDR=YES;"
Now this is my final connection string, from here onwards first I use the action, 'Set Connection' from the 'Data-OLEDB' business object and pass this connection string there:
Next, I use 'Open' action from the 'Data-OLEDB' business object to open the session and then I create query in order to split the column called 'Name' from 'Names' worksheet and put the first splitted value in the 'Filtered Name' column that we created in part I of the code in the 'Filtered Names' worksheet. The query looks as below:
INSERT INTO [Filtered Names$]([Filtered Name]) SELECT (MID([Name],1,INSTR([Name]," ")-1)) FROM [Names$A:A]
To generate this query I used a calculation stage and used the below expression to store it inside a data item:
NOTE: Here I have used the expression [Name$A:A] since I have my limiting my data in the Name worksheet within 'A' column of excel only. If you have the value at some other column replace 'A' with that column index value.
Post this I use the 'Execute' action and pass the query data item inside it to execute the query and split the data as shown below:
Next, I use 'Close' action from the 'Data-OLEDB' business object to close the connection to the file and now I get the below output:
So now if you are happy with this output in a separate sheet then part III of the code is not required but if you want it to be in the same sheet for some reason then I have created the part III of the code for that purpose.
Part III Of Code: Merge Results In Original Sheet:
So this part of the workflow simply opens up the file, copies the worksheet range from 'Filtered Names' sheet that we created to a column range in my original worksheet which was called 'Names' and then deletes the 'Filtered Names' worksheet so that we only have the original worksheet at the end of the day with us.
The 'Copy And Paste Range' action has the below parameters where I am basically copying the entire results that we got to column 'B' of my original worksheet:
So after the running the last part of the code, you will end up with the below results:
So, this workflow although a pretty long one has the benefits of not having any memory issues as we are not using collections at all and secondly since ODBC drivers are being used the speed also is good as for me it took 10 seconds to process 6500 records which I have in my input file. Lastly, tomorrow even if you get 60,000 or more records this solution can work even in those scenarios.
Alternatives Apart From This Solution:
1) Other approaches also can be used like of power queries if you are familiar with them where I mentioned how to use them with a document file which you can follow for the same and you can use it to trigger from Blue Prism: Working With Power Queries as they are faster even than OLEDB approach and yes they definitely can work with very large datasets.
2) Also, one approach can be to use the excel macros but they have restrictions at times in some organizations so I did not recommend that solution. But if you go via that route that also can give you results but you need to build it in such a way that Blue Prism enters the file path and column which you want to split in one sheet and then taking those as a parameter your macro would do the work once Blue Prism calls the macro. However, that needs some VBA knowledge and it won't work that nicely with a very large dataset.
3) Next approach as you figured out can be to use native excel VBO actions but they have their limitations when it comes to a very large dataset and you may not be able to use them at all. Hence, I did not recommend them. But, yes using those also I can tell you a solution which can give you the desired results either using excel actions or by extending them to create a custom action with a code stage.
All these above mentioned approaches will never require any collections and will help you to work with any dataset given any degree of complexity to it avoiding memory issues with Blue Prism altogether.
------------------------------
----------------------------------
Hope it helps you out and if my solution resolves your query, then please mark it as the 'Best Answer' so that the others members in the community having similar problem statement can track the answer easily in future
Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com
----------------------------------
------------------------------
27-05-22 09:33 AM