Text from PDF Document to be split into collection

BrittanyHarding
Level 3
Hi Guys,
I could really use some help.

I need to move data (Address, Telephone Number, and Organisation) from a PDF document into a Collection, to be used to verify information in a separate process.

Do you have any advice on this?
Thank you

------------------------------
Brittany Harding
------------------------------
7 REPLIES

ewilson
Staff
Hello @Brittany Harding,

There are a few ways you could handle this.

  1. I believe there's some basic built-in OCR capability within Blue Prism (not referring to Decipher, but that's another option) that you might be able to leverage here. I haven't tried it myself, but I'm sure someone on the community can comment on it.
  2. There are various PDF tools available on the DX. Most of those will likely have some sort of cost associated with them. One example is the PDF Services Export asset. It's a wrapper around an Adobe Services REST API that can be used to export PDFs to other formats, including Office document formats. You could then use the standard Blue Prism Office VBOs to work with the data. The catch is that, I believe, you need an Adobe subscription unless you're just testing.
  3. I believe Microsoft Word will actually open a PDF and automatically convert it to a .DOCX, so you might want to give that a try.
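
Whichever option gets the PDF into plain text, the text still has to be split into fields before it can go into a Collection. Here is a minimal sketch of that step in Python, purely as an illustration. It assumes the extracted text uses "Label: value" lines; the labels in `FIELD_LABELS`, the function name `split_fields`, and the sample text are all hypothetical, since the real PDF layout isn't shown in this thread. The same idea could be ported to a Blue Prism code stage.

```python
import re

# Hypothetical labels: adjust to whatever the real document actually uses.
FIELD_LABELS = ("Organisation", "Address", "Telephone Number")


def split_fields(text):
    """Return one dict (think: a single Collection row) keyed by field label."""
    label_pattern = "|".join(re.escape(label) for label in FIELD_LABELS)
    # Capture everything after "Label:" up to the next label line or end of text.
    pattern = re.compile(
        rf"^({label_pattern})\s*:\s*(.*?)(?=^(?:{label_pattern})\s*:|\Z)",
        re.MULTILINE | re.DOTALL,
    )
    row = {label: "" for label in FIELD_LABELS}
    for label, value in pattern.findall(text):
        # Collapse wrapped lines (e.g. a multi-line address) into one line.
        row[label] = " ".join(value.split())
    return row


if __name__ == "__main__":
    # Made-up sample text standing in for the converted PDF content.
    sample = (
        "Organisation: Example Practice\n"
        "Address: 1 Example Street\n"
        "Exampletown\n"
        "EX1 2MP\n"
        "Telephone Number: 01234 567890\n"
    )
    print(split_fields(sample))
```

Fields that are missing from the text come back as empty strings, which makes it easy to spot gaps before the verification step.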

Cheers,

------------------------------
Eric Wilson
Director, Integrations and Enablement
Blue Prism Digital Exchange
------------------------------

Thank you for replying. I am still new to BP, so I am not sure those steps would be easy for me to do.
Within eRS (the electronic referral service) we need to gather the Registered Practice Details, which sit behind a hyperlink that opens a popover window (screenshots 1 and 2). In the object I created, I added a Navigate stage to click the hyperlink and a Read stage to Get Table Items (screenshot 3).

1. 26514.png
2. 26515.png
3. 26516.png
At the bottom you will see the error message that I get when trying to run this process (screenshot 4). I have tried spying in UIA mode as well, and that does not work either. I believe it could just be our environment, but I could be wrong. I don't know if anyone has experience with this web application, but I'd really appreciate any help you can offer.
4. 26517.png


------------------------------
Brittany Harding
------------------------------

Hi Brittany,

May I suggest removing the pictures from points 1 and 2 of your examples above and replacing them with masked customer data? Publishing live customer data is generally frowned upon under GDPR.

------------------------------
Happy coding!
---------------
Paul
Sweden
------------------------------
(By all means, do not mark this as the best answer!)

Thank you, I have removed it.

------------------------------
Brittany Harding
------------------------------

Hi Brittany,
Can you please clarify the process? From the screenshots above, it looks like the object is trying to read the data from a web page and not from a PDF. Is that correct?


------------------------------
Konstantin Kazantsev
Solutions Architect
Church and Dwight
America/New_York
------------------------------

Right, I can see how that is confusing. The original process was what I was struggling with, so I decided to put it aside and try something else. I just need to be able to copy and paste the UBRN information: Registered Organization, Address, and Telephone number.
26532.png


------------------------------
Brittany Harding
------------------------------

For reading PDF data, there are many options, and some are free. We've tested many of them and have been parsing PDFs for a few years, in case you decide to come back to that solution.

For reading data from websites:
- An API is the best way to interact with the website.
- If an API is not available:
   a) What is your Blue Prism version?
   b) Is this an internal website or one from a third party?

------------------------------
Konstantin Kazantsev
Solutions Architect
Church and Dwight
America/New_York
------------------------------