
OCR region marking by Decipher

KrishnaElapavul
Level 6
Hi all,
I would like to understand how Decipher marks the OCR regions during PDF processing.
  1. Is each line treated as one region? This does not always happen.
  2. Is each word treated as one region (with words guessed from the spacing)? This does not always happen either.
What logic does Decipher use for this region marking?
Please share your views.

------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------
1 REPLY

Ben.Lyons1
Staff
Hi Krishna,

The OCR stage doesn't define what is marked in the document; that is done during the capture stage.

The OCR engine works out what is text and which characters it is, or could be. The capture client then combines that information with the training data to decide what gets highlighted as a region. So it starts with a general idea based on the spacing and layout of the text, and after training it updates how some of the regions are separated.

For example, without training it may outline "Invoice No: 0123456" as a single region, but once trained it may separate the label from the value if you have previously selected "0123456".
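To make the idea concrete, here is a rough Python sketch of the two behaviours described above: grouping words into regions by spacing, then splitting out a value that was selected during training. This is only an illustration and is not Decipher's actual logic. The `Word` class, the `group_by_gap` and `split_on_learned` functions, the `max_gap` threshold and the coordinates are all made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """A recognised word with hypothetical left/right pixel positions on one line."""
    text: str
    left: float
    right: float


def group_by_gap(words, max_gap):
    """Merge neighbouring words into one region when the gap between them is small.

    Assumption: all words sit on the same text line, and max_gap is an
    arbitrary threshold chosen for this illustration.
    """
    regions = []
    for word in sorted(words, key=lambda w: w.left):
        if regions and word.left - regions[-1][-1].right <= max_gap:
            regions[-1].append(word)   # close enough: extend the current region
        else:
            regions.append([word])     # large gap: start a new region
    return [" ".join(w.text for w in region) for region in regions]


def split_on_learned(region, learned_values):
    """Split a region so that previously selected values stand on their own."""
    parts, current = [], []
    for token in region.split():
        if token in learned_values:
            if current:
                parts.append(" ".join(current))
                current = []
            parts.append(token)        # the learned value becomes its own region
        else:
            current.append(token)
    if current:
        parts.append(" ".join(current))
    return parts


line = [
    Word("Invoice", 0, 50),
    Word("No:", 55, 80),
    Word("0123456", 88, 150),
    Word("Date:", 300, 340),
]

untrained = group_by_gap(line, max_gap=10)
trained = [part for region in untrained
           for part in split_on_learned(region, {"0123456"})]
```

Here `untrained` keeps "Invoice No: 0123456" together because the gaps are small, while `trained` separates "0123456" from its label, which mirrors the before and after training behaviour in the example above.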

Regards

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------
Ben Lyons
Principal Product Specialist - Decipher
SS&C Blue Prism
UK based