Regarding PDF Automation
04-12-17 08:22 PM
When you need to automate PDFs, can Blue Prism generally handle everything by itself, or do you often need third-party software such as Google Tesseract or ABBYY?
3 REPLIES
04-12-17 10:34 PM
It depends on the type of PDF. If it contains digital data that you can either copy to the clipboard or save as TXT, then you can usually use text parsing to get what you want. But if it's a scan/image, then you need OCR. Some Tesseract functionality is embedded in BP, and it's possible to call ABBYY as a local DLL or as a web service.
https://portal.blueprism.com/system/files/Guide%20-%20Interfacing%20wit…
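For the digital-data case, the text parsing step could look roughly like the sketch below. It is written in Python purely for illustration (inside Blue Prism the same idea would more likely live in a code stage or a regex action), and it assumes the PDF text has already been saved as TXT. The sample text, the field labels and the `parse_fields` helper are all made up for this example and do not come from any real document or Blue Prism object.

```python
import re

# Hypothetical sample of text saved from a digital PDF (illustrative only).
SAMPLE_TEXT = """\
Invoice Number: INV-1042
Invoice Date: 04/12/2017
Total Due: 1,250.00
"""

# One pattern per field. The labels are assumptions; adjust them to whatever
# the exported text of your own documents actually contains.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"^Invoice Number:[ \t]*(\S+)[ \t]*$", re.MULTILINE),
    "invoice_date": re.compile(r"^Invoice Date:[ \t]*(\d{2}/\d{2}/\d{4})[ \t]*$", re.MULTILINE),
    "total_due": re.compile(r"^Total Due:[ \t]*([\d,]+\.\d{2})[ \t]*$", re.MULTILINE),
}


def parse_fields(text):
    """Return a dict of field name -> captured string, or None if the label is missing."""
    result = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        result[name] = match.group(1) if match else None
    return result


if __name__ == "__main__":
    print(parse_fields(SAMPLE_TEXT))
```

Returning None for a missing label, rather than raising, lets the calling process decide whether to route the document to an exception queue or fall back to OCR.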
12-12-17 12:31 AM
Thanks for the response. In this case I was asking about digital data.
13-12-17 03:41 PM
There are also some quite helpful DLLs for interacting with PDF documents. We use iTextSharp and PDFsharp, for example, to split a PDF document into separate pages or to read the text inside it.
