
Issue in capturing table Line-Item data.

kiranb25
Level 4

Hi All,

I have a table in a PDF document from which I need to capture column data. One of the columns, the product description, may span a single row or multiple rows (1-8). Even after training on multiple documents (~50), the data is still not being captured correctly.

E.g., if the description column has 5 lines, only 1, 2, or 3 lines are captured, as if it assumes that is all the data there is, based on other documents. If there is only one line, it is captured correctly.

Any suggestions for capturing the entire description correctly and consistently (minimum 80% accuracy)? I already have UTD = true set in the DFD.

Note: The table may contain multiple line items. Each line-item description varies from 1 to 8 lines.

Thanks,



------------------------------
kiran B
------------------------------
6 REPLIES

Hi Kiran,

Have you set the data type of your Product Description field to "MultiLine"?



------------------------------
Athiban Mahamathi - https://www.linkedin.com/in/athiban-mahamathi-544a008b/
Technical Consultant,
SimplifyNext PTE LTD,
Singapore
------------------------------

Hi Athiban Mahamathi,

Yes, I did.

Thanks,



------------------------------
kiran B
------------------------------

Hi Kiran,

I also faced a similar issue with bank statements, where the transaction description was sometimes quite long. I followed the steps below and it worked for me. Maybe you can give it a try.

  • Delete the training data. (If you already have trained data for different DFDs, download the existing training data as a backup before deleting it. Once the new training is complete, you can upload the backup to append it to the current training data.)
  • Create the DFD with only the table field name and data type, without any formulas, expressions, or conditions.
  • Start your training with simple documents that have only a few description lines, then upload the troublesome documents in the next set to see whether the values are picked up. (You can then enable the formulas/expressions gradually.)

Please let me know if you need any more help.

 



------------------------------
Athiban Mahamathi - https://www.linkedin.com/in/athiban-mahamathi-544a008b/
Technical Consultant,
SimplifyNext PTE LTD,
Singapore
------------------------------

Sure Athiban,

I will give it a try.

Thanks,



------------------------------
kiran B
------------------------------

Hi Kiran,

Were you able to extract the description? If you found a better way to resolve your issue, please share it here.



------------------------------
Athiban Mahamathi - https://www.linkedin.com/in/athiban-mahamathi-544a008b/
Technical Consultant,
SimplifyNext PTE LTD,
Singapore
------------------------------

Hi Athiban Mahamathi,

What I am currently doing is creating several batches based on the number of description lines. For example, all the PDFs with a 3-row description go into a separate batch; I use a few documents as the training set and test against the other batch to see how the description is captured. Likewise for the others.
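
For anyone who wants to script this batching, a minimal sketch in Python could look like the following. It is only an illustration, not part of the product: the file names, the manually counted line numbers, the `batch_by_line_count` helper, and the choice to split each group into a small training set and a test set are all assumptions.

```python
from collections import defaultdict


def batch_by_line_count(line_counts, train_size=2):
    """Group documents by description line count, then split each group.

    line_counts: dict mapping a file name to the number of description
    lines counted in that document (hypothetical, counted by hand).
    train_size: how many documents per group go into the training set.
    """
    groups = defaultdict(list)
    # Sort by file name so the split is reproducible between runs.
    for name, count in sorted(line_counts.items()):
        groups[count].append(name)

    batches = {}
    for count, names in sorted(groups.items()):
        batches[count] = {
            "train": names[:train_size],  # first few documents for training
            "test": names[train_size:],   # the rest for checking the capture
        }
    return batches


# Hypothetical file names and line counts, for illustration only.
example_counts = {
    "inv_a.pdf": 3,
    "inv_b.pdf": 3,
    "inv_c.pdf": 3,
    "inv_d.pdf": 1,
    "inv_e.pdf": 5,
}

for count, split in batch_by_line_count(example_counts).items():
    print(f"{count}-line descriptions: train={split['train']} test={split['test']}")
```

Groups with fewer documents than `train_size` end up with an empty test set, which makes it easy to spot line counts that need more samples.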

As of now it is capturing correctly for some documents (50 to 60%) but not for all. I am still looking at other options, such as miscellaneous parameters like turning "border table" on, since most of the documents have a single line item with multiple rows.

Note: I feel the way the description is laid out in the PDF documents is a bit too complex to segregate into fixed batches, because the rows do not follow a single format, such as equal spacing between rows. In a few documents, for example, there are 2 rows of text close together, then an empty gap of 2 lines, then another 3 rows of data.

I would appreciate any suggestions you have on this.

Thanks,



------------------------------
kiran B
------------------------------