OCR region marking by Decipher
26-05-21 10:00 AM
Hi all
I would like to understand how Decipher marks OCR regions during PDF processing.
- Is each line one region? That is not always what happens.
- Is each word one region (and if so, how does it identify each word, based on spaces)? That is not always what happens either.
What logic does Decipher use for this region marking?
------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------
1 REPLY
27-05-21 08:23 AM
Hi Krishna,
The OCR stage doesn't define what is marked in the document; that is done during the capture stage.
The OCR engine works out which parts of the page are text and which characters they are or could be. The capture client then combines this information with the training data to determine what is highlighted as a region. It starts with a general idea based on the spacing of the text and the layout, and after training it updates how some of the regions are separated.
E.g. without training it may outline "Invoice No: 0123456" as one region, but once trained it may separate the label from the value if you have previously selected "0123456".
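To make the idea concrete, here is a rough sketch of spacing-based grouping followed by a training-driven split. This is not Decipher's actual logic, which isn't described in this thread; the `Word` class, the `group_into_regions` function, the `max_gap` threshold, and the `split_values` set (a stand-in for "values previously selected during training") are all hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    x0: float  # left edge of the word box
    x1: float  # right edge of the word box
    line: int  # index of the text line the word sits on


def group_into_regions(words, max_gap=10.0, split_values=frozenset()):
    """Merge neighbouring words on the same line when the gap between them is small.

    split_values is an assumed stand-in for training data: any word in it
    is always kept as its own region instead of being merged.
    """
    regions = []
    prev = None
    for w in sorted(words, key=lambda w: (w.line, w.x0)):
        joinable = (
            prev is not None
            and w.line == prev.line
            and w.x0 - prev.x1 <= max_gap
            and w.text not in split_values
            and prev.text not in split_values
        )
        if joinable:
            regions[-1].append(w.text)
        else:
            regions.append([w.text])
        prev = w
    return [" ".join(r) for r in regions]


words = [
    Word("Invoice", 0, 50, 0),
    Word("No:", 55, 80, 0),
    Word("0123456", 85, 140, 0),
]

# Untrained: everything close together becomes one region.
print(group_into_regions(words))
# "Trained": the previously selected value is split out.
print(group_into_regions(words, split_values={"0123456"}))
```

With the sample words, the first call groups the whole "Invoice No: 0123456" string together, and the second call keeps "0123456" as a separate region, which mirrors the before and after training behaviour described above.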
Regards
------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------
Ben Lyons
Principal Product Specialist - Decipher
SS&C Blue Prism
UK based
