
How to select Scanned PDF or Image Files to perform surface automation and fetch values

JIGARPARIKH
Level 4
How can we select scanned PDF or image files to perform surface automation and fetch values? I want to perform OCR on image files and scanned PDFs. Under which option can I open these files in the Region section and highlight the values I want to read from images or scanned PDFs?
4 REPLIES

hemant_chawla
Level 3
The link below mentions that something related to OCR for scanned documents is available in "Blue Prism v4.2.47": https://portal.blueprism.com/node/2534. However, in our case we are not able to access the PDF in the Blue Prism tool. We are able to open the PDF document in a browser but still cannot capture details. Please suggest a suitable method.

michel_lazzari
Level 2
Is it possible to read from scanned PDFs in version 5.0.5? I need to extract values from my scanned PDFs, but I can't identify anything with the Region section.

ShreyansNahar
Level 5
Hi Michel, Before trying to attach Adobe Reader, please launch Adobe Reader manually; you will then be able to get all the elements of Adobe Reader. Since it is a scanned PDF, the whole page will be a single image, and you need to select the region as a whole. You will have to use OCR along with the font recognition technique given in the Surface Automation document.

Denis__Dennehy
Level 15
Just for clarification, Blue Prism is not an OCR solution. There is an OCR facility within Blue Prism for reading text from screen regions; this technique could be used on a PDF if you can guarantee the quality and position of the text within your document. But, as with all OCR software, you need to test to ensure the quality of the OCR output can be guaranteed. My recommendation, if you need to read text from documents in multiple formats and of varying quality, is to look for a dedicated OCR data capture tool (rather than a basic OCR archiving tool) or to retain some manual staff to capture the data into a structured format that you can then give to Blue Prism to process.
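
To illustrate the point about testing OCR output before trusting it, here is a minimal Python sketch of one way to check values after they have been read. It is not a Blue Prism feature and is not taken from the Surface Automation document. The field names, the patterns, and the function name validate_ocr_fields are all hypothetical and would need to match your own documents. The idea is simply to accept a value only when it matches the expected format and to send everything else to manual review.

```python
import re

# Hypothetical field rules (assumptions for illustration only):
# field name -> pattern the OCR text must match in full.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"INV-\d{6}"),
    "amount": re.compile(r"\d{1,7}\.\d{2}"),
    "date": re.compile(r"\d{2}/\d{2}/\d{4}"),
}


def validate_ocr_fields(fields):
    """Split OCR'd fields into accepted values and values needing manual review."""
    accepted, needs_review = {}, {}
    for name, pattern in FIELD_PATTERNS.items():
        # A missing field is treated as an empty string, which never matches.
        value = fields.get(name, "").strip()
        if pattern.fullmatch(value):
            accepted[name] = value
        else:
            needs_review[name] = value
    return accepted, needs_review


if __name__ == "__main__":
    # "00l234" contains a lowercase letter l misread in place of the digit 1.
    sample = {
        "invoice_number": "INV-00l234",
        "amount": "150.00",
        "date": "03/11/2016",
    }
    accepted, needs_review = validate_ocr_fields(sample)
    print("accepted:", accepted)
    print("manual review:", needs_review)
```

A format check like this only catches misreads that break the expected shape of a value. A digit misread as another digit would still pass, so it does not replace testing against known-good documents.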