02-02-23 04:03 AM
02-02-23 07:16 AM
21-02-23 05:41 AM
Hi Paul
Can you please provide the DX link of the object?
Many thanks in advance
21-02-23 11:15 AM
Hi Manish,
I wrote '...I think...', implying I am not sure, as we do not use any DX objects in our shop.
That said, my '...I know...' is based on earlier posts on this subject in this community, so some googling on your side will probably unearth clues as to where to find such a DX object.
22-02-23 08:50 AM
Hi Amrutha,
Last year I had a process where we had to extract some data from PDF files. I used an alternative approach for this task: I converted the PDF files to Excel files and then read the cell values with the help of the Excel utility.
You can also try this method.
22-02-23 01:34 PM
Hi Amrutha,
Last year, I worked on an automation where I had to update and extract data from PDF forms. I used C# code and the iTextSharp DLL for this use case.
Please find below the details -
Inputs - filePath(Text)
Outputs - outputText(Text), Success(Flag), Message(Text)
Code -
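// Needs the namespaces System.Text (StringBuilder) and iTextSharp.text.pdf (PdfReader).
Success = true;
Message = "";
outputText = "";
StringBuilder text = new StringBuilder();
PdfReader pdfReader = null;
var pdf_filename = filePath;
try
{
    pdfReader = new PdfReader(pdf_filename);
    // AcroFields holds the form fields of the PDF, keyed by field name.
    var fields = pdfReader.AcroFields.Fields;
    foreach (var key in fields.Keys)
    {
        var value = pdfReader.AcroFields.GetField(key);
        // Append one "name----value;" entry per field.
        text.Append(key+"----"+value+";");
    }
    outputText = text.ToString();
}
catch (Exception exx)
{
    Success = false;
    Message = exx.Message;
}
finally
{
    // Close the reader whether or not reading succeeded.
    if (pdfReader != null)
    {
        pdfReader.Close();
    }
}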
You will get the details in the outputText data item. After that, split the text on the ; character (the code builds each entry with text.Append(key+"----"+value+";")), then split each entry on ---- to separate the field name from its value.
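The splitting step described above could look roughly like the following. This is only a minimal sketch, not part of the original code stage: the class name FieldParser, the method Parse and the sample field names are made up for illustration, and it assumes that no field value contains ";" and no field name contains "----".

```csharp
using System;
using System.Collections.Generic;

public static class FieldParser
{
    // Turns "name----value;name----value;" into a name -> value dictionary.
    // Assumption: values never contain ';' and names never contain "----".
    public static Dictionary<string, string> Parse(string outputText)
    {
        var result = new Dictionary<string, string>();
        if (string.IsNullOrEmpty(outputText))
        {
            return result;
        }

        // The trailing ';' would produce an empty last entry, so drop empties.
        var entries = outputText.Split(new[] { ';' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (var entry in entries)
        {
            int idx = entry.IndexOf("----", StringComparison.Ordinal);
            if (idx < 0)
            {
                continue; // Not a name/value entry, skip it.
            }
            string key = entry.Substring(0, idx);
            string value = entry.Substring(idx + 4); // 4 = length of "----"
            result[key] = value;
        }
        return result;
    }

    public static void Main()
    {
        // Hypothetical sample in the same shape the code stage produces.
        var fields = Parse("Name----Jane;Agree----Yes;Comment----;");
        foreach (var pair in fields)
        {
            Console.WriteLine(pair.Key + " = " + pair.Value);
        }
    }
}
```

An empty form field comes back as an empty string rather than being dropped, which should make unticked or unfilled fields easy to spot. If the values may contain ";" themselves, a different separator would be needed in the code stage.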
Also, you need to import the DLLs in the Code Options -
Please let me know if you need any additional information.
23-02-23 03:32 AM
Thank you Sahil.
I tried your approach, but unfortunately Excel reads some fields as images and does not return structured data. The PDF to Excel conversion gives me a mix of image and text values.
23-02-23 03:43 AM
Thanks a lot for your detailed explanation. I truly appreciate your effort.
I would like to try out the method you have suggested. If you don't mind, could you share the authentic URLs for downloading the DLLs?
I had tried using BP objects from the Digital Exchange and worked on some Python code to read the PDF. Since the PDF is an editable form, these approaches can read only the field labels, not the field values.
23-02-23 03:45 AM
Thanks Paul for your suggestion.
I tried a few objects from DX and also tried converting the PDF into Word and Excel. Neither could extract the data: the information is read either as an image or as blank values, because the PDF is an editable form.
23-02-23 07:55 AM
Hi @Amrutha Sivarajan ,
Did you try opening the PDF file in Chrome or another browser? Opening a file in a browser sometimes helps with spying the relevant elements, and you can then try reading the checkbox values.