Azure Form Recognizer Analyze Form - PDF

TahaSonmez
Level 5
Hi everyone,

I'm trying to integrate Blue Prism with Microsoft Azure Form Recognizer. I'm using this skill from Digital Exchange.

I have PDFs on my machine that I want to analyze. When I send the request with the PDF as a Base64 string, it returns an error. I'm not sure whether encoding the PDF file to Base64 is the correct approach.
8342.png


The error:

8344.png
And this is how I convert the PDF to Base64:

8346.png

Any suggestions?

------------------------------
Taha Sonmez
------------------------------

GopalBhaire
Level 10

Hi,


I think you need to send the PDF data as binary. You can use the LoadBinaryFile function in a Calculation stage to do that.


Thanks,

Gopal



------------------------------
Gopal Bhaire
Analyst
Accenture
------------------------------
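
To illustrate why a Base64 string tends to be rejected when the service expects the file itself, here is a minimal Python sketch (not Blue Prism code). It assumes the service recognises a PDF by its leading `%PDF-` header bytes; `raw_pdf` is a made-up stand-in for what LoadBinaryFile would read from disk, and `looks_like_pdf` is a hypothetical helper.

```python
import base64


def looks_like_pdf(payload: bytes) -> bool:
    """Return True when the payload starts with the PDF header bytes."""
    return payload.startswith(b"%PDF-")


# Stand-in for the bytes of a real PDF file (hypothetical content).
raw_pdf = b"%PDF-1.7\n%%EOF\n"

# Base64 turns the same bytes into ASCII text, so the PDF header is no longer visible.
encoded = base64.b64encode(raw_pdf)

print(looks_like_pdf(raw_pdf))
print(looks_like_pdf(encoded))
```

The encoded form only becomes a PDF again after decoding, which is why sending the raw bytes is the safer choice here.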

AminPatel1
Level 5
Hi,

There are two ways:
  1. Change the Analyze Form request in the Form Recognizer Client Service Web API definition to accept a single file instead of the default template {"source":"[source]"}. Then load the PDF as binary data in a Calculation stage with the expression LoadBinaryFile(source-file). OR
  2. In the Analyze Form action inputs, set ContentType to "application/json" and pass the URL of the PDF document in the source parameter, without wrapping it in quotation marks.


------------------------------
Amin Patel
Intelligent Automation Developer
Emerson
Pune India
------------------------------
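
The two options above differ only in what goes into the request body and the Content-Type header. A rough Python sketch of both shapes, outside Blue Prism, might look like this. The route in `ANALYZE_PATH`, the `Ocp-Apim-Subscription-Key` header, and the example endpoint are assumptions based on the v2.0 REST API and should be checked against your own resource; `build_analyze_request` is a hypothetical helper.

```python
import json

# Assumed route for the v2.0 custom-model Analyze Form call; verify for your resource.
ANALYZE_PATH = "/formrecognizer/v2.0/custom/models/{model_id}/analyze"


def build_analyze_request(endpoint, key, model_id, *, file_bytes=None,
                          source_url=None, content_type="application/pdf"):
    """Return (url, headers, body) for one of the two Analyze Form variants."""
    if (file_bytes is None) == (source_url is None):
        raise ValueError("pass exactly one of file_bytes or source_url")

    url = endpoint.rstrip("/") + ANALYZE_PATH.format(model_id=model_id)
    if source_url is not None:
        # Option 2: a JSON body that holds the document URL.
        content_type = "application/json"
        body = json.dumps({"source": source_url}).encode("utf-8")
    else:
        # Option 1: the raw file bytes are the whole body (no Base64, no JSON wrapper).
        body = file_bytes

    headers = {"Ocp-Apim-Subscription-Key": key, "Content-Type": content_type}
    return url, headers, body


# Hypothetical values for illustration only.
url, headers, body = build_analyze_request(
    "https://example.cognitiveservices.azure.com", "<key>", "<model-id>",
    source_url="https://example.com/invoice.pdf",
)
print(headers["Content-Type"], body.decode("utf-8"))
```

The returned tuple can be handed to any HTTP client, for example `urllib.request.Request(url, data=body, headers=headers, method="POST")`.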

Hi @AminPatel1​ and @Taha Sonmez

I am unable to consume this Web API with binary data. I changed the body content of the request to 'Single File' and set the content type to application/pdf. I am sending binary data, but an error occurs.

Internal : Unexpected error Error during Web API HTTP Request
HTTP Status Code: 400
HTTP Response Content: {"error":{"code":"1002","message":"Analyze request is either invalid or missing required parameters. Refer to the API reference and retry your request."}}

Can you please help me?




------------------------------
John L. Lehman
------------------------------
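
For anyone debugging the same 400 response from a script rather than from Blue Prism, a small Python sketch for pulling the code and message out of the error body quoted above could look like this. `describe_error` is a hypothetical helper, and the JSON layout is assumed to match the response shown in this post.

```python
import json


def describe_error(status_code, content):
    """Summarise an error response; fall back to the raw text if it is not the expected JSON."""
    try:
        error = json.loads(content)["error"]
        return f"HTTP {status_code} [{error['code']}]: {error['message']}"
    except (ValueError, KeyError, TypeError):
        # Not JSON, or JSON without the error/code/message fields.
        return f"HTTP {status_code}: {content}"


sample = ('{"error":{"code":"1002","message":"Analyze request is either invalid '
          'or missing required parameters. Refer to the API reference and retry your request."}}')
print(describe_error(400, sample))
```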

@John_L_Lehman, could you please share a screenshot of your input data?

------------------------------
Amin Patel
Intelligent Automation Developer
Emerson
Pune India
------------------------------

Hi,

8309.png
8310.png
8311.png

------------------------------
John L. Lehman
------------------------------

Try setting the content type to application/octet-stream when sending the Single File in the request. There is also a sample implementation of the same approach here: https://digitalexchange.blueprism.com/dx/entry/3439/solution/microsoft-computer-vision-api-v30-preview . This URL is for version 3 of Azure Form Recognizer.

------------------------------
Shashank Kumar
DX Integrations Partner Consultant
Blue Prism
Singapore
+6581326707
------------------------------

Hello @John_L_Lehman,

Please find below the parameter data types required by the Analyze Form request:

1.1. Analyze Form

Analyze Form extracts key-value pairs, tables, and semantic values from a given document and returns a resultId for use with the "Get Analyze Form Result" action. The input document must be of one of the supported content types: 'application/pdf', 'image/jpeg', 'image/png' or 'image/tiff'. Alternatively, use the 'application/json' type to specify the URL location of the document to be analyzed.

Note: When configuring source (media type), the default value is application/json, which requires you to specify the URL location of the document to be analysed. All other media types require the raw binary of the file.

 

1.1.1. Request

| Parameter | Direction | Data Type | Description |
|---|---|---|---|
| modelId | In | Text | String format: uuid. Model identifier. |
| includeTextDetails | In | boolean | Include text lines and element references in the result. Default: false. |
| ContentType | In | Text | Media type of the body sent to the API. Default: application/json. |
| source | In | Text | Request body: .json, .pdf, .jpg, .png or .tiff type file stream. |

 
Change the data type of includeTextDetails from text to boolean in your API request.
Let me know if this works.




------------------------------
Amin Patel
Intelligent Automation Developer
Emerson
Pune India
------------------------------
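
As a rough illustration of the includeTextDetails point outside Blue Prism, the Python sketch below builds the query string from a real boolean instead of free text. `analyze_query` is a hypothetical helper, and the lowercase true/false form is an assumption about what the service expects.

```python
from urllib.parse import urlencode


def analyze_query(include_text_details: bool) -> str:
    """Build the Analyze Form query string from a real boolean value."""
    if not isinstance(include_text_details, bool):
        raise TypeError("includeTextDetails must be a boolean, not text")
    # str(True) is "True" in Python, so lower-case it before encoding.
    return urlencode({"includeTextDetails": str(include_text_details).lower()})


print(analyze_query(True))  # includeTextDetails=true
```

Rejecting text values up front mirrors the data type change suggested above: a value such as "yes" never reaches the request.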