Reading PDF picture with OCR and comparing text to template


Hello everyone!

I'm currently facing a challenge on a client project: we have to read all the text from an image-only PDF (specifically a signed, scanned document) and compare it to the source template to spot any differences.

Our biggest issues right now are:

What is the best way to read the PDF? If we go the OCR route, it will never be 100% accurate (and it has to be, since we are comparing the result to the original document to spot differences). On top of that, we would have to spy Adobe Reader and worry about zooming, scrolling down, etc.

How can we compare text and get a percentage of match? Is there any VBO available that does this?
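To make the second question concrete, here is a minimal sketch of the kind of comparison we have in mind, written in Python with only the standard library `difflib` module. It is an illustration, not an existing Blue Prism VBO: the function names `normalize`, `match_percentage` and `word_differences` are hypothetical, and it assumes the OCR output and the template are already available as plain strings.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Collapse whitespace runs and lowercase the text, so that layout
    noise (line breaks, double spaces) is not counted as a difference."""
    return re.sub(r"\s+", " ", text).strip().lower()


def match_percentage(template: str, extracted: str) -> float:
    """Return the similarity of the two texts as a percentage (0 to 100),
    based on difflib's character-level matching ratio."""
    a, b = normalize(template), normalize(extracted)
    ratio = difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()
    return round(ratio * 100, 2)


def word_differences(template: str, extracted: str) -> list:
    """List the word-level differences as (operation, template_words,
    extracted_words) tuples, skipping the parts that are equal."""
    a = normalize(template).split()
    b = normalize(extracted).split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, " ".join(a[i1:i2]), " ".join(b[j1:j2]))
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
```

Calling `match_percentage(template_text, ocr_text)` would give the overall score, while `word_differences` would point at the specific words that changed. OCR misreads would still show up as differences, so a score below 100 would not by itself prove the document was altered.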

We know there are third-party apps that can do this, like ABBYY. However, we would like to test non-third-party solutions before going that route, since this document contains sensitive data.

Thanks in advance for any help you may provide.

Best Regards,
André Sales.

------------------------------
André Sales Lopes
Consultant
EY
Europe/London
------------------------------
