cancel
Showing results for 
Search instead for 
Did you mean: 

Read required columns data from Excel

vinodchinthakin
Level 9
Hi All,
I Have a Excel file with sheet 1 containing columns A,B,C,D.............Y,Z
I want to read data of columns C,E,W,X from excel file. I dont want remaining data.
I tried get worksheet range as collection, but its working only for single column and for multiple its taking sequence as I provide start and end cells.
In my requirement I need random columns from excel, for this its throwing an error " Could not execute code stage because exception thrown by code stage: This action won't work on multiple selections."

Can some one say how to read excel file of required columns.
Note: I know we can use OLEDB connection with Queries. But I want to check with Excel VBO or any code stage.

Thanks in advance

------------------------------
vinod chinthakindi
------------------------------
10 REPLIES 10

EmersonF
MVP
Hi vinod, if it's always these columns you want, you can capture all the columns and run a loop removing all the columns you don't want.
Second solution, copy the columns you want to a temporary excel and then read the information from that temporary excel.
Third solution, use OLEDB.

Note: depending on the volume of data I do not suggest the loop

------------------------------
Emerson Ferreira
Sr Business Analyst
Avanade Brasil
Recife
+5581988869544
If my answer helped you? Mark as useful!
------------------------------
Sr Cons at Avanade Brazil

John__Carter
Staff
Staff
Another option is to delete the unwanted columns in Excel, read all remaining columns (CEWX) and close the file without saving changes.

------------------------------
John Carter
Professional Services
Blue Prism
------------------------------

Hi Vinod,

As per you requirement you can read the full data first from the excel file and once you retrieve the entire input collection, you can use this custom action that I have created called as 'Filter Columns'. For this to work, you can simply extend the 'Collection Manipulation' VBO and follow the below steps:

1) Add the namespace reference: 'System.Collections.Generic' in case if it is not there as shown below under the Code Options tab in the Page Description stage of your 'Initialise' action:

17297.png

2) Create the below action with the input parameters: Input Collection (Collection), Fields (Collection) and Keep Fields (Boolean) and an output parameter: Output Collection (Collection):

17298.png

NOTE: The Fields collection should always have just one field with the name as 'Field Name' and type as 'Text' for storing the field name for this action to work.

3) In the code stage, use the following input and output parameters as shown below along with the code:

17299.png

17300.png


Code:

17301.png17302.png


' Check if the field names supplied in the Fields collection exist in the supplied input collectionDataColumnsAvailable

Dim MissingFields As String = ""
Dim Field_List As New List(Of String)

Dim DataColumnsAvailable As DataColumnCollection

DataColumnsAvailable = Input_Collection.Columns

    For Each row In Fields.Rows

         If Not DataColumnsAvailable.Contains(row("Field Name").ToString()) Then

               If MissingFields.Equals("") Then

                   MissingFields = row("Field Name").ToString()

               Else

                   MissingFields = MissingFields & "," & row("Field Name").ToString()

               End If

         End If

        Field_List.Add(row("Field Name").ToString())

   Next


If Not MissingFields.Equals("") Then

   Throw New Exception("The following supplied field names do not exist in the input collection : " & MissingFields)

End If


' If the Keep Fields flag is True, filtering the input collection

If Keep_Fields = True Then

     Output_Collection = Input_Collection.DefaultView.ToTable(False,Field_List.ToArray())

Else

     ' If the Keep Fields flag is False, getting fields to be kept and filtering the input collection

     ' Get Fields to be Kept

      Dim Field_List_New As New List(Of String)

      For Each FieldName In DataColumnsAvailable

          If Not Field_List.Contains(FieldName.ColumnName) Then

               Field_List_New.Add(FieldName.ColumnName)

          End If

      Next

      Output_Collection = Input_Collection.DefaultView.ToTable(False,Field_List_New.ToArray())

End If


Test Results:


The input collection and fields has the following values:

17303.png

When the value of the Keep Fields data item is False, all the fields mentioned in the Fields collection will be removed from the Input Collection and will be assigned to the Output Collection:

17306.png
When the value of the Keep Fields data item is True, all the fields mentioned in the Fields collection will be kept in the Input Collection removing the rest of the fields and the value will be assigned to the Output Collection:

17307.png

So in conclusion, you can use the action in any of the scenario you want to perform the mode of filtering for the columns based on the Keep Fields flag value. For example, if the columns to be removed is more than the columns to be kept I would suggest to use Keep Fields flag as True and just define the column names that you want to keep in the Fields collection (easier to maintain and define) and if columns to be removed is less than the columns to be kept I would suggest to use Keep Fields flag as False and just define the column names that you want to remove in the Fields collection.

Let us know if that helps!!

------------------------------
----------------------------------
Hope it helps you and if it resolves you query please mark it as the best answer so that others having the same problem can track the answer easily

Regards,
Devneet Mohanty
Intelligent Automation Consultant
Blueprism 6x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------
---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

Thanks @devneetmohanty07 for the solution. Your solution works for less volume excel data. But in my case Excel volume is very big (1-1.5 lakh rows with 40-50 columns). So instead of reading complete excel data, I want to read only required column data (I need 10-12 columns data only).

Thanks @John Carter. You solution worked, but I have to use loop to delete multiple columns using "Delete Columns" action, as action won't work on multiple selections. Do you know how to pass multiple columns while using Delete Column Action.

Hi @EmersonF , Solution 1 doesn't work for me because of huge volume of data & solution 2, I think it increases execution time. Can you tell me what is the action used to copy columns of an excel to another excel. If no other solution found then I will go with your Third solution i.e OLEDB

​​​​

------------------------------
vinod chinthakindi
------------------------------

Hi Vinod,

You should have mentioned that there were 1.5 lakh rows in your original post, otherwise my suggestion would have been to go with OLEDB only or something else. When working with any requirement where the excel data is large only ADO .NET or OLEDB are your best bet most of the times. Otherwise one very neat and efficient approach that I have used recently has been to use power queries. Create a template with Power Query parameters in one sheet.

Create a power query connection once you populate the parameters (Input File Path, Sheet Names, Column Names to filter) and then you can simply invoke a VB Script or create your own custom action to refresh the power query file. It works way faster than OLEDB as per my prior experience ( Real time analysis: 70000 records, OLEDB took some 5 min to process many business logics and Power Query took simply 6 seconds.)

------------------------------
----------------------------------
Hope it helps you and if it resolves you query please mark it as the best answer so that others having the same problem can track the answer easily

Regards,
Devneet Mohanty
Intelligent Automation Consultant
Blueprism 6x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------
---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

HI Devneet Mohanty. Sorry I forgot to mention the no. of records in my original post. I had some issues with OLEDB. So looking for alternate solution of code or VBO.
I am not aware of Power Queries. Definitely I will look into them. If possible you can elaborate more about power queries, how to use in BP or share any KB articles, so that I can do work around.

Thanks

------------------------------
vinod chinthakindi
------------------------------

As per your use case, I have attached the document explaining all the steps required for Creating Sample Power Query file, Integrating Power Query with Blue Prism and testing the same.

On my machine, it took around 6 seconds on Debug Mode (Fast speed) to process 1.7 lakh records roughly with 11 columns. Hope it also resolves your queries as well. Let us know it this helps 🙂

P.S. Note the data I have used is publicly available on Kaggle

------------------------------
----------------------------------
Hope it helps you and if it resolves you query please mark it as the best answer so that others having the same problem can track the answer easily

Regards,
Devneet Mohanty
Intelligent Automation Consultant
Blueprism 6x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------
------------------------------
---------------------------------------------------------------------------------------------------------------------------------------
Hope this helps you out and if so, please mark the current thread as the 'Answer', so others can refer to the same for reference in future.
Regards,
Devneet Mohanty,
SS&C Blueprism Community MVP 2024,
Automation Architect,
Wonderbotz India Pvt. Ltd.

Thanks @devneetmohanty07 for providing document with detailed explanation. once I implement your solution, will provide a complete review on it.​

------------------------------
vinod chinthakindi
------------------------------

@vinod chinthakindi Unfortunately I can't explain it with prints because I'm on a new machine and I don't have blue prism installed yet, but I'll try to describe as much as I can, you'll need to create a single handle (instance) where you should open your data source (Excel where it contains the values to be copied) then you can create a new workbook XYZ, give an activate workbook to the data source and use the Select Range action, to select the column range eg "C:C" to select the whole column, then use the action copy, then use the activate workbook again to select the workbook XYZ and use the Past option, and repeat this process for each column you want to copy



------------------------------
Emerson Ferreira
Sr Business Analyst
Avanade Brasil
Recife
+5581988869544
If my answer helped you? Mark as useful!
------------------------------
Sr Cons at Avanade Brazil