<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic RE: OCR region marking by Decipher in Product Forum</title>
    <link>https://community.blueprism.com/t5/Product-Forum/OCR-region-marking-by-Decipher/m-p/70910#M23515</link>
    <description>Hi Krishna,&lt;BR /&gt;&lt;BR /&gt;The OCR stage doesn't define what is marked in the document; that is done during the capture stage.&lt;BR /&gt;&lt;BR /&gt;The OCR engine works out which parts of the document are text and which characters they are or could be. The capture client combines this information with the training data to determine what is highlighted as a region. Before training, it has a general idea based on the text spacing and layout; after training, it updates how some of the regions are separated.&lt;BR /&gt;&lt;BR /&gt;E.g. without training it may outline "Invoice No: 0123456" as a single region, but once trained it may separate the label from the value if you have previously selected "0123456".&lt;BR /&gt;&lt;BR /&gt;Regards&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Ben Lyons&lt;BR /&gt;Product Consultant&lt;BR /&gt;Blue Prism&lt;BR /&gt;UK&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
    <pubDate>Thu, 27 May 2021 07:23:00 GMT</pubDate>
    <dc:creator>Ben.Lyons1</dc:creator>
    <dc:date>2021-05-27T07:23:00Z</dc:date>
    <item>
      <title>OCR region marking by Decipher</title>
      <link>https://community.blueprism.com/t5/Product-Forum/OCR-region-marking-by-Decipher/m-p/70909#M23514</link>
      <description>Hi all,&lt;BR /&gt;I would like to understand how Decipher marks the OCR regions when processing PDFs.&lt;BR /&gt;
&lt;OL&gt;
&lt;LI&gt;Is each line marked as one region? This does not always happen.&lt;/LI&gt;
&lt;LI&gt;Is each word marked as one region (and if so, does it identify each word based on spaces)? This does not always happen either. I wonder what logic Decipher uses for this region marking.&lt;/LI&gt;
&lt;/OL&gt;
Please share your views.&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Krishna Elapavuluri&lt;BR /&gt;Technology Consultant&lt;BR /&gt;DXC.technology&lt;BR /&gt;Asia/Kolkata&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Wed, 26 May 2021 09:00:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/OCR-region-marking-by-Decipher/m-p/70909#M23514</guid>
      <dc:creator>KrishnaElapavul</dc:creator>
      <dc:date>2021-05-26T09:00:00Z</dc:date>
    </item>
    <item>
      <title>RE: OCR region marking by Decipher</title>
      <link>https://community.blueprism.com/t5/Product-Forum/OCR-region-marking-by-Decipher/m-p/70910#M23515</link>
      <description>Hi Krishna,&lt;BR /&gt;&lt;BR /&gt;The OCR stage doesn't define what is marked in the document; that is done during the capture stage.&lt;BR /&gt;&lt;BR /&gt;The OCR engine works out which parts of the document are text and which characters they are or could be. The capture client combines this information with the training data to determine what is highlighted as a region. Before training, it has a general idea based on the text spacing and layout; after training, it updates how some of the regions are separated.&lt;BR /&gt;&lt;BR /&gt;E.g. without training it may outline "Invoice No: 0123456" as a single region, but once trained it may separate the label from the value if you have previously selected "0123456".&lt;BR /&gt;&lt;BR /&gt;Regards&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Ben Lyons&lt;BR /&gt;Product Consultant&lt;BR /&gt;Blue Prism&lt;BR /&gt;UK&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Thu, 27 May 2021 07:23:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/OCR-region-marking-by-Decipher/m-p/70910#M23515</guid>
      <dc:creator>Ben.Lyons1</dc:creator>
      <dc:date>2021-05-27T07:23:00Z</dc:date>
    </item>
  </channel>
</rss>

