<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic RE: pdf data extraction in Digital Exchange</title>
    <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58369#M1634</link>
    <description>&lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/988"&gt;@aseelodeh&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;The PDF Toolkit requires an account with Adobe Document Cloud. You can sign up for a free developer account with them for testing.&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Eric Wilson&lt;BR /&gt;Director, Partner Integrations for Digital Exchange&lt;BR /&gt;Blue Prism&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
    <pubDate>Wed, 07 Jul 2021 12:38:00 GMT</pubDate>
    <dc:creator>ewilson</dc:creator>
    <dc:date>2021-07-07T12:38:00Z</dc:date>
    <item>
      <title>pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58361#M1626</link>
      <description>Hello, what is the best free way to extract data from a PDF?&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;aseel odeh&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Sun, 04 Jul 2021 12:38:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58361#M1626</guid>
      <dc:creator>aseelodeh</dc:creator>
      <dc:date>2021-07-04T12:38:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58362#M1627</link>
      <description>You can look into Decipher IDP, a Blue Prism product that helps extract data from PDFs.&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://portal.blueprism.com/product/related-products/blue-prism-decipher-idp-11" target="test_blank"&gt;https://portal.blueprism.com/product/related-products/blue-prism-decipher-idp-11&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Sai Devendra Kumar Komma&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Mon, 05 Jul 2021 03:22:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58362#M1627</guid>
      <dc:creator>Sai_Devendra_Ku</dc:creator>
      <dc:date>2021-07-05T03:22:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58363#M1628</link>
      <description>&lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/988"&gt;@aseelodeh&lt;/a&gt;, the best option is Decipher, but if it's something simple, like a string, you can copy the content into a data item and use a regex to find the desired value. If you just need to check whether a word exists, use InStr().&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Emerson Ferreira&lt;BR /&gt;Sr Business Analyst&lt;BR /&gt;Avanade Brasil&lt;BR /&gt;+55 (081) 98886-9544&lt;BR /&gt;If my answer helped you, mark it as useful!&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Mon, 05 Jul 2021 13:37:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58363#M1628</guid>
      <dc:creator>EmersonF</dc:creator>
      <dc:date>2021-07-05T13:37:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58364#M1629</link>
      <description>Yes. As you know, copying data from a PDF extracts the text without formatting. Do you have a way other than regex to process the data in an Excel file? I need it for multiple different files.&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;aseel odeh&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Jul 2021 10:03:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58364#M1629</guid>
      <dc:creator>aseelodeh</dc:creator>
      <dc:date>2021-07-06T10:03:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58365#M1630</link>
      <description>There are various ways to extract data from PDFs. The "best" way depends on your specific use case and the makeup of the PDFs that you'll be dealing with. Some examples have been mentioned above. Additional options for extracting data include:&lt;BR /&gt;
&lt;UL&gt;
&lt;LI&gt;Use the &lt;A href="https://digitalexchange.blueprism.com/dx/entry/9648/solution/blue-prism---pdf-toolkit" target="_blank" rel="noopener"&gt;PDF Toolkit&lt;/A&gt; from the DX to convert the PDF to a Word doc and then use the MS Word VBO to work with the contents.&lt;/LI&gt;
&lt;LI&gt;Use the open-source &lt;A href="https://www.xpdfreader.com/download.html" target="_blank" rel="noopener"&gt;Xpdf Tools&lt;/A&gt; to convert a PDF to text and then use the Strings utility VBO to work with the text.&lt;/LI&gt;
&lt;/UL&gt;
Cheers,&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Eric Wilson&lt;BR /&gt;Director, Partner Integrations for Digital Exchange&lt;BR /&gt;Blue Prism&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Jul 2021 13:26:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58365#M1630</guid>
      <dc:creator>ewilson</dc:creator>
      <dc:date>2021-07-06T13:26:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58366#M1631</link>
      <description>OK,&lt;BR /&gt;if the PDF is editable and its text can be copied, do you have a method for importing and processing the data in Excel?&lt;BR /&gt;Note that I have different PDF formats.&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;aseel odeh&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Jul 2021 14:08:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58366#M1631</guid>
      <dc:creator>aseelodeh</dc:creator>
      <dc:date>2021-07-06T14:08:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58367#M1632</link>
      <description>&lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/988"&gt;@aseelodeh&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;The PDF Toolkit, mentioned above, uses Adobe's Document Cloud platform. There's an action in the VBO called &lt;EM&gt;&lt;STRONG&gt;ExportPDFToDocx&lt;/STRONG&gt;&lt;/EM&gt;. You could copy that action into a new action and then change the following line in its code stage; I believe it would then export the input PDF as an XLSX file.&lt;BR /&gt;&lt;BR /&gt;&lt;span class="lia-inline-image-display-wrapper" image-alt="7529.png"&gt;&lt;img src="https://community.blueprism.com/t5/image/serverpage/image-id/7705i1D53C4793279ACD2/image-size/large?v=v2&amp;amp;px=999" role="button" title="7529.png" alt="7529.png" /&gt;&lt;/span&gt;&lt;BR /&gt;Change the line highlighted above to this:&lt;BR /&gt;&lt;BR /&gt;&lt;EM&gt;&lt;STRONG&gt;ExportPDFOperation exportPdfOperation = ExportPDFOperation.CreateNew(ExportPDFTargetFormat.XLSX);&lt;/STRONG&gt;&lt;/EM&gt;&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Eric Wilson&lt;BR /&gt;Director, Partner Integrations for Digital Exchange&lt;BR /&gt;Blue Prism&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Jul 2021 17:28:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58367#M1632</guid>
      <dc:creator>ewilson</dc:creator>
      <dc:date>2021-07-06T17:28:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58368#M1633</link>
      <description>Does leaving "CredentialsFilePath" empty cause an error? If so, what does this parameter mean? I have no credentials for a PDF reader.&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;aseel odeh&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Tue, 06 Jul 2021 18:02:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58368#M1633</guid>
      <dc:creator>aseelodeh</dc:creator>
      <dc:date>2021-07-06T18:02:00Z</dc:date>
    </item>
    <item>
      <title>RE: pdf data extraction</title>
      <link>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58369#M1634</link>
      <description>&lt;a href="https://community.blueprism.com/t5/user/viewprofilepage/user-id/988"&gt;@aseelodeh&lt;/a&gt;,&lt;BR /&gt;&lt;BR /&gt;The PDF Toolkit requires an account with Adobe Document Cloud. You can sign up for a free developer account with them for testing.&lt;BR /&gt;&lt;BR /&gt;Cheers,&lt;BR /&gt;&lt;BR /&gt;------------------------------&lt;BR /&gt;Eric Wilson&lt;BR /&gt;Director, Partner Integrations for Digital Exchange&lt;BR /&gt;Blue Prism&lt;BR /&gt;------------------------------&lt;BR /&gt;</description>
      <pubDate>Wed, 07 Jul 2021 12:38:00 GMT</pubDate>
      <guid>https://community.blueprism.com/t5/Digital-Exchange/pdf-data-extraction/m-p/58369#M1634</guid>
      <dc:creator>ewilson</dc:creator>
      <dc:date>2021-07-07T12:38:00Z</dc:date>
    </item>
  </channel>
</rss>

