PDF interaction
08-03-17 03:22 PM
Hi Everyone,
I need to automate a process that searches a 400-page PDF for a few keywords. For every match of any keyword, the output should be:
>Keyword
>Complete sentence containing keyword
>Associated page number
>Nearest header at top of section.
The PDF follows this template:
Header 1 (in Bold)
Line1
Line2
.
.
.
Line 3
Header 2 (In Bold)
.
.
.
and so on...
Please advise how I can achieve this, or how to search the text between the headers without hardcoding.
4 REPLIES
08-03-17 09:48 PM
Have you seen the guide in the learning area of the Portal called 'Interfacing with PDF Documents'?

Anonymous
Not applicable
02-05-17 05:46 PM
Hi Alekh, Denis,
How do we find the nearest header at the top of the section?
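One way to think about the nearest-header part, independent of Blue Prism: walk the lines in order, remember the last bold line you saw, and treat everything up to the next bold line as one section. Below is a rough Python sketch of that idea, not a tested solution for any particular PDF. It assumes the text has already been extracted page by page and that each line carries a flag saying whether it is bold (how you obtain that flag depends on your PDF tooling and is not shown). The names `find_keywords`, `Match` and `Line` are made up for illustration, and the sentence splitting is deliberately naive (it breaks on abbreviations and decimals).

```python
import re
from bisect import bisect_right
from typing import Iterable, List, NamedTuple, Optional, Tuple

# Hypothetical input shape: (line_text, is_bold). One list of these per page.
Line = Tuple[str, bool]


class Match(NamedTuple):
    keyword: str
    sentence: str
    page: int                # 1-based page on which the keyword occurs
    header: Optional[str]    # nearest bold header above it, None if there is none


# Naive sentence pattern: text up to the next '.', '!' or '?' (or end of section).
_SENTENCE = re.compile(r"[^.!?]+(?:[.!?]+|$)")


def _scan_section(header, body, patterns) -> List[Match]:
    """Search one section; body is a list of (page_no, line_text)."""
    text, offsets, page_nos = "", [], []
    for page_no, line in body:
        if text:
            text += " "            # join lines so sentences can span lines and pages
        offsets.append(len(text))  # where this line starts in the joined text
        page_nos.append(page_no)
        text += line

    found = []
    for sent in _SENTENCE.finditer(text):
        for keyword, pattern in patterns:
            hit = pattern.search(sent.group())
            if hit:
                pos = sent.start() + hit.start()
                page = page_nos[bisect_right(offsets, pos) - 1]
                found.append(Match(keyword, sent.group().strip(), page, header))
    return found


def find_keywords(pages: List[List[Line]], keywords: Iterable[str]) -> List[Match]:
    """Report keyword, sentence, page number and nearest header for every match."""
    patterns = [
        (k, re.compile(r"\b" + re.escape(k) + r"\b", re.IGNORECASE)) for k in keywords
    ]
    results, header, body = [], None, []
    for page_no, lines in enumerate(pages, start=1):
        for text, is_bold in lines:
            text = text.strip()
            if not text:
                continue
            if is_bold:
                # A bold line closes the previous section and starts a new one.
                results.extend(_scan_section(header, body, patterns))
                header, body = text, []
            else:
                body.append((page_no, text))
    results.extend(_scan_section(header, body, patterns))  # last section
    return results
```

Because the current header is carried across page breaks, a section that continues onto the next page still reports the right header, and nothing about the header names is hardcoded.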
12-10-20 08:36 AM
Hi,
Can anyone please tell me whether we can extract data from a web-embedded PDF which is readable using BP 6.7?
------------------------------
Zaheed Khan
Deputy Manager
WNS, Asia/Kolkata
------------------------------
13-10-20 02:33 AM
What do you mean by "readable using BP6.7"?
Generally speaking, if you are able to, by hand, copy and paste the text that you need from the PDF, then Blue Prism should be able to as well. If the PDF is actually an image (and not selectable text), then you'll need to use surface automation techniques and OCR in order to extract that data.
------------------------------
James Man
Professional Services
Blue Prism
Asia/Hong_Kong
------------------------------
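The copy-and-paste check described above can also be approximated in code once some text-extraction step has run: pages that come back with almost no characters are probably scanned images and would need OCR. A small Python sketch of that heuristic follows. It assumes `page_texts` is a list of strings, one per page, produced by whatever extractor you use; the function name and the `min_chars` threshold of 20 are arbitrary choices for illustration.

```python
from typing import List


def pages_needing_ocr(page_texts: List[str], min_chars: int = 20) -> List[int]:
    """Return 1-based page numbers whose extracted text looks empty.

    A page counts as "empty" when it has fewer than min_chars letters or
    digits, which usually means there is no selectable text layer.
    """
    flagged = []
    for page_no, text in enumerate(page_texts, start=1):
        readable = sum(ch.isalnum() for ch in text)
        if readable < min_chars:
            flagged.append(page_no)
    return flagged
```

Pages in the returned list are candidates for the surface automation and OCR route; the rest can probably be read as plain text.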
