<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Data Verification - Incorrect Data Extraction in Product Forum</title>
    <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99125#M46745</link>
    <description>&lt;P&gt;Hi &lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/51491"&gt;@Stephen__Guest&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Can you tell us more about the data itself, such as where it is taken from (Excel, etc.)? It might help us understand what's causing the issue.&lt;/P&gt;</description>
    <pubDate>Tue, 05 Mar 2024 10:57:04 GMT</pubDate>
    <dc:creator>michaeloneil</dc:creator>
    <dc:date>2024-03-05T10:57:04Z</dc:date>
    <item>
      <title>Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99124#M46744</link>
      <description>&lt;P&gt;Has anyone had the same issue with zero being read as the letter 'O' in fields that require a mixture of letters and numbers?&lt;/P&gt;</description>
      <pubDate>Tue, 27 Feb 2024 16:26:48 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99124#M46744</guid>
      <dc:creator>Stephen__Guest</dc:creator>
      <dc:date>2024-02-27T16:26:48Z</dc:date>
    </item>
    <item>
      <title>Re: Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99125#M46745</link>
      <description>&lt;P&gt;Hi &lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/51491"&gt;@Stephen__Guest&lt;/a&gt;&lt;/P&gt;
&lt;P&gt;Can you tell us more about the data itself, such as where it is taken from (Excel, etc.)? It might help us understand what's causing the issue.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2024 10:57:04 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99125#M46745</guid>
      <dc:creator>michaeloneil</dc:creator>
      <dc:date>2024-03-05T10:57:04Z</dc:date>
    </item>
    <item>
      <title>Re: Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99126#M46746</link>
      <description>&lt;P&gt;Hi &lt;A class="user-content-mention" data-sign="@" data-contactkey="ddd4d6a3-af39-4c64-8db4-cdb1d05ed669" data-tag-text="@Michael ONeil" href="https://community.blueprism.com/network/profile?UserKey=ddd4d6a3-af39-4c64-8db4-cdb1d05ed669" data-itemmentionkey="a9bbed71-d138-4508-a089-f92a5bb51049"&gt;@Michael ONeil&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;The documents are in PDF format, and the field is read as an invoice reference. For example, invoice ref INV00126 could be captured in the verification stage as INVO0126.&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2024 12:19:47 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99126#M46746</guid>
      <dc:creator>Stephen__Guest</dc:creator>
      <dc:date>2024-03-05T12:19:47Z</dc:date>
    </item>
    <item>
      <title>Re: Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99127#M46747</link>
      <description>&lt;P&gt;Ah, extracting from a PDF can be awkward. Are you using OCR to identify and extract the information? Have you tried using 'Get text' and regex to extract it?&lt;/P&gt;</description>
      <pubDate>Tue, 05 Mar 2024 14:26:03 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99127#M46747</guid>
      <dc:creator>michaeloneil</dc:creator>
      <dc:date>2024-03-05T14:26:03Z</dc:date>
    </item>
    <item>
      <title>Re: Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99128#M46748</link>
      <description>&lt;P&gt;Hi Michael,&lt;/P&gt;
&lt;P&gt;Regex works well when the field is structured. However, we see issues with O/0 and I/1 in the PDFs we process in Decipher, as they contain a free-form "Reference" field provided by clients (alphanumeric, no standard length, and sometimes containing dashes). Please elaborate on the "Get text" you mentioned, as I can't find anything on it in the Decipher documentation and it sounds like something we should look into.&lt;/P&gt;</description>
      <pubDate>Wed, 06 Mar 2024 17:42:26 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99128#M46748</guid>
      <dc:creator>stuart.mar</dc:creator>
      <dc:date>2024-03-06T17:42:26Z</dc:date>
    </item>
    <item>
      <title>Re: Data Verification - Incorrect Data Extraction</title>
      <link>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99129#M46749</link>
      <description>&lt;P&gt;Hi Stuart,&lt;/P&gt;
&lt;P&gt;This could be due to the document resolution (not necessarily the same thing as document quality). Decipher uses Tesseract OCR to read the text, and Tesseract is optimised for 300 dpi, which is an important factor in how it reads various fonts. A font rendered at 300 dpi will look slightly different from one rendered at 250 dpi, and this can cause similar characters to be mistaken for one another. (It may also be due to a poor-quality scan.)&lt;/P&gt;
&lt;P&gt;If possible, I would recommend using a Format Expression, as Decipher can use this to better verify characters prone to this type of 'mistaken identity'. In this case, perhaps the following expression would work: "(INV[0-9]{5})". If this would cause issues for other invoices, you could set this up in a &lt;A href="https://bpdocs.blueprism.com/decipher-2-3/en-us/user-guide/specific-versions.htm?tocpath=Interface%7CAdmin%20panel%7CDocument%20form%20definitions%7C_____2"&gt;Specific Version&lt;/A&gt;.&lt;/P&gt;
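&lt;P&gt;As a rough illustration outside Decipher, the sketch below shows how a pattern of this shape accepts INV00126 but rejects the misread INVO0126. It uses Python's re module purely for demonstration; the helper name is_valid_ref is hypothetical, and Decipher's own Format Expression handling may differ from a plain full-string match.&lt;/P&gt;

```python
import re

# Pattern taken from the post: "INV" followed by exactly five digits.
# Compiling it with Python's re module is an assumption for illustration only.
INVOICE_REF = re.compile(r"INV[0-9]{5}")


def is_valid_ref(value):
    """Return True only when the whole string matches the expected format."""
    return INVOICE_REF.fullmatch(value) is not None


# The correctly read reference matches the pattern.
print(is_valid_ref("INV00126"))
# The misread reference has a letter O where a digit is required, so it fails.
print(is_valid_ref("INVO0126"))
```

&lt;P&gt;A check like this only flags the bad reading. It does not say which character was misread, so a free-form reference field with no fixed pattern would not benefit from it.&lt;/P&gt;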
&lt;P&gt;Thanks&lt;/P&gt;
&lt;P&gt;Ben&lt;/P&gt;</description>
      <pubDate>Thu, 07 Mar 2024 07:59:27 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Product-Forum/Data-Verification-Incorrect-Data-Extraction/m-p/99129#M46749</guid>
      <dc:creator>Ben.Lyons1</dc:creator>
      <dc:date>2024-03-07T07:59:27Z</dc:date>
    </item>
  </channel>
</rss>

