How can I read values from PDF, Word, PPT etc in Blue Prism?

Level 4
How can I read values from PDF, Word, PPT etc. in Blue Prism?

Level 15
There is a Word VBO (similar to the Excel VBO you have already used) distributed in the installation folder of the Blue Prism product. To interface with a PDF you have two options. The first is to interface with the PDF like any other Windows application and use Surface Automation to select all and copy the text in the PDF to the clipboard. The second is to purchase Adobe Pro, which makes it possible to export the PDF as a Word or XML document that you can then interface with. I do not know of any PPT interfaces.

Level 4
Thanks for your reply, and for the Word VBO information. Regarding PDF, I am not able to access the PDF object as a desktop application, and I don't know which object I can use to open my PDF file directly. Do you have any idea? I have tried launching the PDF exe and opening my document from the File menu using Application Modeller, but I am not able to access the PDF menus and other controls. I do not want to use Surface Automation for this because I am not sure how good the OCR quality will be. Can you comment on how strong the OCR engine in BP is? Is it able to read correct values from scanned PDFs, and what is its accuracy percentage? For PPT I am fine; I will access it as a desktop application. Can you please also let me know how to open a specific PDF or any other file? Which object can we use to select a JPG or PDF file?

Level 15
Do you have Adobe Reader installed on your PC? If so you should be able to launch it as you would any other application and use Surface Automation (if you have done that training) to interface with it.

Level 2
You can open the .pdf file directly in Adobe Reader by passing it as a command line parameter. Example:

Application name: Open adobe pdf
Application path: C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe
Command line parameters: "c:\Test_folder\test.pdf"

I also cannot get Blue Prism to recognise any components inside the Adobe Reader window. Every time I try to identify an element (with any of the 3 modes, and with the application defined as a Windows application), Blue Prism pops up an error message box saying:

There was an error during the spying operation.
System.ApplicationException: The window spied was not found in the model
kohteessa BluePrism.AMI.clsAMI.Spy(clsElementTypeInfo& elementType, List`1& identifiers)
kohteessa Automate.frmIntegrationAssistant.HandleSpyOrLaunchClick(Object sender, EventArgs e)

(The word "kohteessa" is Finnish and in this case translates as "in".)

-Pete
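For anyone who wants to check the launch command outside Blue Prism first, here is a minimal Python sketch of the same idea: the reader executable followed by the PDF path as its only parameter. The paths are the ones from the post above, the function names are made up for illustration, and `subprocess.list2cmdline` is a CPython helper used here only to show the quoting.

```python
import subprocess

# Path taken from the post above; adjust it for the local installation.
READER_EXE = r"C:\Program Files (x86)\Adobe\Acrobat Reader DC\Reader\AcroRd32.exe"


def build_reader_command(pdf_path, reader_exe=READER_EXE):
    """Return the argument list that opens pdf_path in Adobe Reader."""
    return [reader_exe, pdf_path]


def open_pdf(pdf_path, reader_exe=READER_EXE):
    """Start Adobe Reader on pdf_path without waiting for it to exit."""
    return subprocess.Popen(build_reader_command(pdf_path, reader_exe))


if __name__ == "__main__":
    # Print the command line as Windows would receive it; arguments that
    # contain spaces are wrapped in double quotes.
    command = build_reader_command(r"c:\Test_folder\test.pdf")
    print(subprocess.list2cmdline(command))
```

Passing the arguments as a list leaves the quoting to the library, which is the same job the double quotes do in the Command line parameters field.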

Level 4
Thanks, guys, for your replies. I now understand how to open a PDF file. One point I take from your replies is that Blue Prism can only read content from a PDF via the Surface Automation technique (OCR). I am not sure whether we can read an entire table via OCR and get it into a Collection. Can someone please confirm this?

Level 3
It's not too hard to use a third-party command line tool to extract the text from the PDF. I've been using the command line tool from Apache PDFBox, and it's super fast. My normal workflow is to convert the PDF to text on the command line and read the text using the default Utility objects. Provided all you need is the content, it's certainly easier and less fragile than automating Acrobat. -rj
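The convert-then-read workflow described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not the poster's actual setup: the jar file name `pdfbox-app.jar` and the function names are hypothetical, Java is assumed to be on the PATH, and the `ExtractText` syntax is the one used by the PDFBox 2.x command line app (other releases may use a different syntax). Inside Blue Prism the same command would be started from a process and the resulting text file read with the Utility objects.

```python
import subprocess
from pathlib import Path

# Assumption: hypothetical file name for the PDFBox 2.x command line jar.
PDFBOX_JAR = "pdfbox-app.jar"


def build_extract_command(pdf_path, txt_path, jar_path=PDFBOX_JAR):
    """Build the PDFBox 2.x 'ExtractText' command as an argument list."""
    return ["java", "-jar", str(jar_path), "ExtractText", str(pdf_path), str(txt_path)]


def pdf_to_text(pdf_path, txt_path, jar_path=PDFBOX_JAR):
    """Convert pdf_path to a text file and return the file's contents."""
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(build_extract_command(pdf_path, txt_path, jar_path), check=True)
    # Assumption: the output is UTF-8; undecodable bytes are replaced.
    return Path(txt_path).read_text(encoding="utf-8", errors="replace")


def non_empty_lines(text):
    """Split extracted text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

A call such as `non_empty_lines(pdf_to_text("invoice.pdf", "invoice.txt"))` (file names are placeholders) would give a list of text lines to search for the values of interest. This only works for PDFs that contain real text; scanned images would still need OCR.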

Level 2
I have been trying to create an object that opens a PDF file from a hyperlink and saves it to a specific location. While doing so I got a Save As pop-up window which I couldn't spy using either mode (AA, Win32). I also created an Acrobat Reader (11.0) business object and tried to attach it to the displayed PDF file so I could model the pop-up message, but I still get the same BP error message. I would appreciate your help with this matter.

Level 15
For your Save As popup you may need to attach using a separate business object, or you may need to use Surface Automation techniques.

Level 2
I have been using the trial version of Adobe Pro, and with it it is possible to export the PDF as Word or Excel. As we are unable to spy Adobe Pro, just launch it through Application Modeller and export to Word using global send keys ["%F", T, S, "{ENTER}"].