How to read the names of PDFs in a folder?

SwatiAgrawal
Level 5
Hi,

I need the bot to do the following:
1. Go to the folder location.
2. Open the folder.
3. Read the names of the documents/PDFs placed in the folder.
4. Store the names of the documents/PDFs in a collection.

Is that possible? Please suggest how this can be achieved.
A screenshot is below for reference. I need to retrieve the document names and place them in a collection.
18412.png
Thanks in advance!!

------------------------------
Swati Agrawal
------------------------------
1 BEST ANSWER

Hi Swati,

The issue is in this particular screenshot that you provided:

18390.png
Here you are giving the 'Exact Value' parameter as "Filterout zero value accounts.Account". Because of the double quotes, Blue Prism reads this as a literal text value, not as a data item reference. Instead, provide the value as [Filterout zero value accounts.Account], with the double quotes removed and the name enclosed in square brackets, since it is not literal text but a reference to a data item holding the text. This way you are referencing the 'Account' field of the 'Filterout zero value accounts' collection, which is what you want.
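
To illustrate why the quoted value leaves the duplicates in place, here is a minimal Python sketch of the deduplication loop discussed in this thread (skip the row if the value is already in the unique collection, otherwise add it). It is not Blue Prism code: the `dedupe` function, the `value_for` callback and the sample account values are all hypothetical, and the only thing taken from the thread is the quoted versus bracketed 'Exact Value'.

```python
def dedupe(rows, value_for):
    """Simulate the loop: check the unique list for a value, add the row if not found."""
    unique = []
    for row in rows:
        # value_for(row) stands in for whatever is passed as the 'Exact Value' input
        value = value_for(row)
        if any(u["Account"] == value for u in unique):
            continue  # value found: skip to the end of the loop
        unique.append({"Account": row["Account"]})
    return unique


# Hypothetical sample data with one duplicate account
rows = [{"Account": "A1"}, {"Account": "A2"}, {"Account": "A1"}]

# Quoted value: the same literal text is searched for on every row and never found,
# so every row is added and the duplicates survive.
with_literal = dedupe(rows, lambda row: "Filterout zero value accounts.Account")

# Bracketed value: the current row's Account is searched for, so repeats are skipped.
with_reference = dedupe(rows, lambda row: row["Account"])

if __name__ == "__main__":
    print(len(with_literal), len(with_reference))
```

Under these assumptions the literal version returns all three rows, while the reference version returns only the two distinct accounts.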

------------------------------
----------------------------------
Hope it helps you and if it resolves you query please mark it as the best answer so that others having the same problem can track the answer easily

Regards,
Devneet Mohanty
Intelligent Process Automation Consultant | Sr. Consultant - Automation Developer,
Wonderbotz India Pvt. Ltd.
Blue Prism Community MVP | Blue Prism 7x Certified Professional
Website: https://devneet.github.io/
Email: devneetmohanty07@gmail.com

----------------------------------


11 REPLIES

Hi Swati

You can use the 'Utility - File Management' object in Blue Prism: create an action stage referencing this object and select the action 'Get Files'. Enter the folder you want to look in and set the pattern, in this case "*.pdf". You can also add further patterns to pick up more file types if required, e.g. "*.pdf, *.xlsx, *.csv".
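
For readers who want to see the idea outside Blue Prism, here is a rough Python sketch of listing the files in a folder that match a comma-separated pattern list. It is only an illustration of the logic, not how the 'Get Files' action is implemented; the `get_files` and `demo` functions and the sample file names are hypothetical, and the "Name" and "Extension" fields are illustrative stand-ins for the columns of the returned collection.

```python
from fnmatch import fnmatch
from pathlib import Path
import tempfile


def get_files(folder, patterns_csv="*.pdf"):
    """Return one row (dict) per file in `folder` matching any comma-separated pattern."""
    patterns = [p.strip().lower() for p in patterns_csv.split(",") if p.strip()]
    rows = []
    # Sort by lower-cased name so the order is predictable
    for path in sorted(Path(folder).iterdir(), key=lambda p: p.name.lower()):
        # Compare in lower case so "*.pdf" also matches "report.PDF"
        if path.is_file() and any(fnmatch(path.name.lower(), p) for p in patterns):
            rows.append({"Name": path.name, "Extension": path.suffix})
    return rows


def demo():
    """Create a throwaway folder with sample files and list the PDF and CSV names."""
    with tempfile.TemporaryDirectory() as folder:
        for name in ("invoice.pdf", "report.PDF", "data.csv", "notes.txt"):
            (Path(folder) / name).touch()
        return [row["Name"] for row in get_files(folder, "*.pdf, *.csv")]


if __name__ == "__main__":
    print(demo())
```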

Hope this helps 🙂

------------------------------
Michael ONeil
Technical Lead developer
NTTData
Europe/London
------------------------------

Hi Swati,

You can use the 'Get Files' action in the 'Utility - File Management' VBO. The VBO is in the VBO folder under your Blue Prism installation path, and you can import it from there if it is not already available. For this action, provide the folder path where the files are located and a pattern such as *.pdf so that only PDF files are picked up. Please refer to the screenshots below:


18289.png

18290.png

------------------------------
Devneet Mohanty
------------------------------

Thanks @devneetmohanty07, this worked. Now, as per your screenshot, how can I pull all the values of the "Name" column from this collection into a new collection?
After the action mentioned above, I need to pull only the document names into a separate collection.

------------------------------
Swati Agrawal
------------------------------

Hi Swati,

Yes, you can generate a collection with just the Name column in a couple of ways.

Approach I:

The easiest way is to create a defined collection, let's call it 'Results', with only a Name column of type 'Text'. Then use a Loop stage to iterate over the collection returned by the 'Get Files' action. Inside the loop, use the 'Add Row' action from the 'Internal - Collections' VBO to add a new row to 'Results', and then use a Calculation stage to store the Name value of the current row of the 'Get Files' collection in Results.Name. See the screenshots below:

18299.png
18300.png
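
As a rough illustration of Approach I, the loop can be sketched in Python as follows. This is not Blue Prism code: collections are modelled as lists of dicts, and the `extract_name_column` function, the sample rows and the "Extension" field are hypothetical.

```python
def extract_name_column(files):
    """Copy only the Name value of each row into a new single-column collection."""
    results = []  # the predefined 'Results' collection, initially empty
    for row in files:  # Loop stage over the 'Get Files' output
        results.append({"Name": ""})  # add a blank row to 'Results'
        results[-1]["Name"] = row["Name"]  # Calculation stage: fill Results.Name
    return results


# Hypothetical sample of what the files collection might contain
files = [
    {"Name": "invoice.pdf", "Extension": ".pdf"},
    {"Name": "report.pdf", "Extension": ".pdf"},
]
results = extract_name_column(files)

if __name__ == "__main__":
    print(results)
```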
Approach II:

The other way is to use the 'Split Collection' action from the 'Utility - Collection Manipulation' VBO. However, there are a few prerequisites. First, you need two predefined template collections, each with a single blank row. Second, the field definitions must be such that one template has the field you want, in your case the 'Name' field, while the other template has the remaining fields that are not needed, as shown below:

18302.png
Here, taken together, the fields of these two collections make up the entire field list of the 'Files' collection returned by the 'Get Files' action, which is the collection we want to split:

18304.png
Once these collections are in place, create two undefined output collections: one to hold the field values we want, let's name it 'Results', and the other to hold the remaining values we don't want, let's call it 'Others'. Set up the parameter values for the 'Split Collection' action as shown below:

18305.png
Now we can run the workflow and get the desired result in the 'Results' collection, while the other values are in the 'Others' collection:

18306.png
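
The outcome of Approach II can be pictured with a short Python sketch. It only mirrors the result described above (one collection with the wanted field, one with the rest) and makes no claim about how the 'Split Collection' action works internally; the `split_collection` function and the sample rows are hypothetical.

```python
def split_collection(rows, wanted_fields):
    """Split each row into the wanted fields and the remaining fields."""
    wanted, others = [], []
    for row in rows:
        wanted.append({k: v for k, v in row.items() if k in wanted_fields})
        others.append({k: v for k, v in row.items() if k not in wanted_fields})
    return wanted, others


# Hypothetical sample rows standing in for the 'Files' collection
files = [
    {"Name": "invoice.pdf", "Extension": ".pdf"},
    {"Name": "report.pdf", "Extension": ".pdf"},
]
results, others = split_collection(files, {"Name"})

if __name__ == "__main__":
    print(results)
    print(others)
```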


------------------------------
Devneet Mohanty
------------------------------

@devneetmohanty07's suggestion will work to get only the names of the files, but do you need to move the names to another collection? Unless you need them for a very specific reason, I would say save yourself some time and just use the Files collection for whatever you need. For example, if you are looking for a specific file in the list, you can loop over the Files collection and reference the column as [Files.Name]. Inside the loop, add a Decision stage with a criterion such as InStr([Files.Name], "fileNameIwant")>0, which will let you select the file you are looking for.
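
The same idea, sketched in Python purely for illustration: loop over the files collection and keep the rows whose name contains the text you are after. The `find_matching_files` function and the sample rows are hypothetical, and Python's `in` test is case-sensitive, which may not match how InStr behaves in Blue Prism.

```python
def find_matching_files(files, wanted_text):
    """Return the names that contain `wanted_text` (case-sensitive substring test)."""
    matches = []
    for row in files:  # loop over the Files collection
        if wanted_text in row["Name"]:  # plays the role of the InStr(...) > 0 decision
            matches.append(row["Name"])
    return matches


# Hypothetical sample rows standing in for the Files collection
files = [
    {"Name": "invoice_2023.pdf"},
    {"Name": "report.pdf"},
    {"Name": "invoice_2024.pdf"},
]
invoices = find_matching_files(files, "invoice")

if __name__ == "__main__":
    print(invoices)
```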

------------------------------
Michael ONeil
Technical Lead developer
NTTData
Europe/London
------------------------------

@devneetmohanty07 @Michael ONeil Thanks!! Both of your approaches helped me solve what I was looking for.
Please also help me with how to remove duplicates from a column in a collection. E.g. the Names column has some duplicate rows. How can I remove them?

------------------------------
Swati Agrawal
------------------------------

@Swati Agrawal, you can create another collection called 'Unique Results' with one column called 'Name' of type 'Text'. After applying the solution I described above, use a Loop stage to iterate over the 'Results' collection, which holds the list of names including duplicates.

Inside the loop, first use the 'Collection Contains Value' action from 'Utility - Collection Manipulation'. Give the input collection as 'Unique Results', the field name as 'Name' and the value as [Results.Name], and store the resulting flag.

This action tells you whether the value in the current row of the 'Results' collection is already present in the 'Unique Results' collection.

If the value is present, link straight to the Loop End stage so that the row is skipped. Otherwise, use the 'Add Row' action to add a row to the 'Unique Results' collection and then set Unique Results.Name to [Results.Name] using a Calculation stage.

Please refer to the screenshots below:

18325.png

18326.png

18327.png
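
The loop described above can be sketched in Python as follows. This is only an illustration of the logic, with collections modelled as lists of dicts; the `unique_names` function and the sample names are hypothetical.

```python
def unique_names(results):
    """Build 'Unique Results' by adding each name only the first time it is seen."""
    unique_results = []
    for row in results:  # Loop stage over 'Results'
        # 'Collection Contains Value': is the current name already in 'Unique Results'?
        already_present = any(u["Name"] == row["Name"] for u in unique_results)
        if already_present:
            continue  # go straight to Loop End
        unique_results.append({"Name": row["Name"]})  # add a row, then set its Name
    return unique_results


# Hypothetical sample with one repeated name
results = [{"Name": "a.pdf"}, {"Name": "b.pdf"}, {"Name": "a.pdf"}]
unique = unique_names(results)

if __name__ == "__main__":
    print(unique)
```

The first occurrence of each name is kept and the original order is preserved.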

------------------------------
Devneet Mohanty
------------------------------

Hi Swati

The Remove Duplicates action should do what you want. I can't remember whether this is an out-of-the-box action or something I created or amended, but either way I've included the code below. Regarding your earlier query, I remembered that I created an action to remove multiple columns from a collection, which might simplify your process. I've included the code for this at the bottom as well.

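Remove Duplicates
Start parameters - Input Collection, Include only unique values
Outputs - Output Collection, Success, Message
----------------------------------------------------------------------
' Code stage inputs: Collection_In (Collection), Only_unique_values_of_columns (Flag)
' Code stage outputs: Collection_Out (Collection), Success (Flag), Error_Message (Text)
Try
    ' ToTable(True) keeps one copy of each distinct row (rows are compared across all columns);
    ' ToTable(False) keeps every row
    Collection_Out = Collection_In.DefaultView.ToTable(Only_unique_values_of_columns)
    Success = True
    Error_Message = ""
Catch ex As Exception
    Error_Message = ex.ToString()
    Success = False
End Try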
18341.png
18342.png

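Delete Multiple Columns
Input parameters - Input collection, Column Names (e.g. "Column1, Column2, Column3")
' Code stage inputs: BP_Collection_In (Collection), Column_Names (Text)
' Code stage outputs: BP_Collection_Out (Collection), Success (Flag), Message (Text)
Dim aPatterns As String()

Try
    ' Work on a copy so the input collection is left unchanged
    BP_Collection_Out = BP_Collection_In.Copy()

    ' "\," is an escaped comma inside a column name; protect it with "?" before splitting
    Column_Names = Column_Names.Replace("\,", "?")
    aPatterns = Column_Names.Split(","c)

    For Each sPattern As String In aPatterns
        ' Restore any escaped comma, trim spaces, then remove the column
        BP_Collection_Out.Columns.Remove(sPattern.Replace("?", ",").Trim())
    Next

    Success = True
    Message = ""
Catch e As Exception
    Success = False
    Message = e.Message
End Try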

18343.png
18344.png



------------------------------
Michael ONeil
Technical Lead developer
NTTData
Europe/London
------------------------------

Hi @devneetmohanty07,

This solution is not giving me unique values; it doesn't remove the duplicates. The "Results" collection is an output collection.
I will show you through screenshots:

This is the Results collection, which has duplicate Accounts:

18355.png
The flow:
18356.png
The unique collection, from which the duplicates need to be removed:
18357.png
The inputs and outputs of the 'Utility - Collection Manipulation' action:

18358.png
18359.png
These are the Calculation stage values:
18360.png

However, when I run this bot, I get the same duplicate values as earlier. Please see below:
18361.png
Not sure if I am going wrong somewhere.

------------------------------
Swati Agrawal
------------------------------