Regarding PDF Automation

ThomasValiyapar
Level 2
When you have to automate PDFs, can Blue Prism generally handle everything by itself, or do you often need third-party software such as Google Tesseract or ABBYY?
3 REPLIES

John__Carter
Staff
It depends on the type of PDF. If it contains digital data that you can either copy to the clipboard or save as TXT, then you can usually use text parsing to get what you want. But if it's a scan/image, then you need OCR. Some Tesseract functionality is embedded in BP, and it's possible to call ABBYY as a local DLL or as a web service. https://portal.blueprism.com/system/files/Guide%20-%20Interfacing%20wit…
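
To illustrate the text parsing route for digital PDFs, here is a minimal Python sketch. It assumes the PDF text has already been copied or saved as TXT. The sample text, the field labels, and the `parse_fields` helper are all hypothetical and only show the idea of matching labelled values with regular expressions; this is not how Blue Prism does it internally.

```python
import re

# Hypothetical text, as it might look after saving a digital PDF as TXT.
SAMPLE_TEXT = """\
Invoice Number: INV-1042
Invoice Date: 2019-03-14
Total Due: 1,250.00 EUR
"""

# Hypothetical field labels; adjust them to the labels in your own documents.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"Invoice Number:\s*(\S+)"),
    "invoice_date": re.compile(r"Invoice Date:\s*(\d{4}-\d{2}-\d{2})"),
    "total_due": re.compile(r"Total Due:\s*([\d,]+\.\d{2})"),
}


def parse_fields(text):
    """Return a dict of field name -> captured value (None if the label is missing)."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    print(parse_fields(SAMPLE_TEXT))
```

Returning `None` for a missing label, rather than raising, lets the calling process decide whether to route the document to an exception queue or fall back to OCR.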

Thanks for the response. I was wondering in this case about digital data.

There are also some quite helpful DLLs for interacting with PDF documents. For example, we use iTextSharp and PDFsharp to split a PDF document into separate pages or to read the text inside it.