PDF Data extraction

DavidTaliga
Level 2
Hi Everyone,

I have recently been working on a project where I had to read data from different types of PDF documents. I would like to ask whether there are plans to create an Object in BP that handles PDF manipulation, or an update that makes working with PDF documents easier.

For now we have only two options for reading data from a PDF:
1. Copy the data with Global Send Keys.
2. Use Surface Automation to read certain regions of the PDF.

I think that is not enough, for these reasons:
1. Copy and paste (Global Send Keys)
- The data is pasted in a different structure, not top to bottom as in the PDF, so if the document contains a large amount of text, tables, etc., it is almost impossible to locate (calculate) all the needed data. Extracting the correct data without hard coding in calculation stages takes too much effort, even where it is possible.
2. Surface Automation
- Surface Automation is still not a 100% reliable approach; customers usually try to avoid it, and it can crash the process very easily.
- Imagine we have many differently structured PDFs (different PDF templates that contain the data). To process them, regions have to be captured for each PDF template separately. With 2-5 templates this is fairly easy, but with 100 different PDFs it is better to do the work manually.

Thank you
David
13 REPLIES

Hi Fredrik,

Thank you for your help on XPDF. I was trying to use it in my code, but the expression seems to be giving me errors. Could you please confirm whether the argument input has the right number of quotes?

It would be a great help.

Regards
Kes

------------------------------
Kesava Naidu Konanki
Data Migration developer
Agilisys
Europe/London
------------------------------

Hi Jyoti Prakash,

Could you please provide me with the argument input expression?

Thanks in advance.

------------------------------
Kesava Naidu Konanki
Data Migration developer
Agilisys
Europe/London
------------------------------

Hi David,

Could you please help me with the argument input expression, with an example? I tried to use it as Fredrik mentioned, but it does nothing.

Thanks

------------------------------
Kesava Naidu Konanki
Data Migration developer
Agilisys
Europe/London
------------------------------

Kes,

You can try the arguments below; they work for me.

/C start "" "C:\Program Files\Blue Prism Limited\Blue Prism Automate\xpdf-tools-win-4.04\bin64\pdftotext.exe" -layout "PDF Path"
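For anyone who would rather test the same call outside Blue Prism first, here is a minimal Python sketch of it. The function names (build_pdftotext_args, extract_text) are made up for illustration, the executable path is simply copied from the command above and may differ on your machine, and the explicit output file and the -enc UTF-8 option are my additions rather than part of the original arguments. Passing the arguments as a list lets Python handle the quoting, which avoids the "right number of quotes" problem when a path contains spaces.

```python
import subprocess
from pathlib import Path

# Assumed install location, copied from the command above; adjust as needed.
PDFTOTEXT = (
    r"C:\Program Files\Blue Prism Limited\Blue Prism Automate"
    r"\xpdf-tools-win-4.04\bin64\pdftotext.exe"
)


def build_pdftotext_args(pdf_path, exe=PDFTOTEXT):
    """Build the argument list: pdftotext -layout -enc UTF-8 <pdf> <txt>."""
    pdf = Path(pdf_path)
    txt = pdf.with_suffix(".txt")  # output text file next to the PDF
    # -layout keeps the physical layout of the page as far as possible;
    # -enc UTF-8 asks for UTF-8 output so the file can be read back safely.
    return [exe, "-layout", "-enc", "UTF-8", str(pdf), str(txt)]


def extract_text(pdf_path, exe=PDFTOTEXT):
    """Run pdftotext and return the extracted text as a string."""
    args = build_pdftotext_args(pdf_path, exe)
    # A list of arguments needs no manual quoting, even with spaces in paths.
    subprocess.run(args, check=True)
    return Path(args[-1]).read_text(encoding="utf-8", errors="replace")
```

Calling extract_text with the full path to a PDF should write a .txt file beside it and return its contents, provided pdftotext.exe really is at the assumed location.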

Regards,

Kishore Kumar Reddy L

Lead Consultant, NTT Data



------------------------------
Kishore L
------------------------------