OCR region not marked correctly by Decipher

KrishnaElapavul · 26-05-21

Hi,
While training a document in Decipher, I noticed strange behavior in the regions that Decipher marks.
You can see it in the attached screenshot, marked with red and blue lines. Are there any additional settings that need to be configured?
I am doing simple rule-based training (not ML). The expected value is the first line after the number "0010/0020".
If you look at the screenshot:
The 1st line is read properly.
The 2nd line is not read, because the marked region spans 3 lines in the PDF.
The 3rd line is mapped to a value, but the wrong one, and the marked region spans 2 lines.
Does anyone have any ideas?

------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Ben.Lyons1 · 26-05-21

Hi Krishna,

How many documents have you trained using this DFD?

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------

Ben Lyons
Principal Product Specialist - Decipher
SS&C Blue Prism
UK based

KrishnaElapavul · 26-05-21

Hi Ben,
Thanks for responding.
I have trained 40+ documents. They are all different PDFs from the same customer, but I still face the same issue.
I also tried reprocessing the same file after training and submitting it, but unfortunately I get the same issue again.

Ben.Lyons1 · 26-05-21

Hi Krishna,

Are you able to use any Sample Headers for each column? e.g. If the document labels Column A as Reference, you can add this to Sample Headers for that column.

Also, have you set the respective fields to have a multi-line flag?

Thanks

KrishnaElapavul · 26-05-21

Yes, Ben.
I have added sample headers in the DFD.
I have set the flag to "Assignable"; I have not selected "Multiline".

Ben.Lyons1 · 27-05-21

Hi Krishna,

I would recommend updating the fields to Multi-Line first and then retraining. You may need to delete your existing training data; however, this will delete it for all documents and users in your environment. If that is not possible, it will just take a little longer for the training to update.

I can't see a reason why this should be happening, and I have no other hints to offer. If you believe the product is not operating as expected, I would recommend raising a support ticket.

Thanks

KrishnaElapavul · 27-05-21

Thanks Ben,
Yes, I suggested a similar solution: read the value as a block and normalize it at the BP level. My customer is not happy with that approach, because they want to train the documents in production and reduce technical dependency.

I have also noticed another possible bug, with date conversion.

If you look at the image above, Decipher finds all the values correctly, but when I read the values from the BP queue for further processing, the date is converted wrongly, as below:

Date-1: returned wrongly; a different date is printed (24 instead of 14)

Date-2: returned correctly

The pattern repeats: lines 1 and 3 come out wrong, while lines 2 and 4 are converted correctly.
I have set the column type to Date, and I receive dates in different formats for the same DFD.
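
As a way to cross-check what comes back from the queue, the raw date text could be parsed downstream against an explicit list of formats. The following is a minimal Python sketch, not a Decipher feature; the function name `candidate_dates` and the format list are assumptions made for illustration. It returns every distinct date the text could mean, so an ambiguous value shows up as more than one result.

```python
from datetime import date, datetime
from typing import List, Sequence

# Hypothetical formats; replace with the ones the customer's PDFs really use.
FORMATS = ("%d-%m-%y", "%m-%d-%y", "%d.%m.%Y")

def candidate_dates(text: str, formats: Sequence[str] = FORMATS) -> List[date]:
    """Return every distinct date that `text` parses to under `formats`."""
    found = set()
    for fmt in formats:
        try:
            found.add(datetime.strptime(text.strip(), fmt).date())
        except ValueError:
            continue  # text does not match this format
    return sorted(found)

print(candidate_dates("14-05-21"))  # [datetime.date(2021, 5, 14)]
print(candidate_dates("04-05-21"))  # two candidates: 5 April and 4 May 2021
```

An empty list means no format matched. More than one entry means the day and month cannot be told apart from the text alone, which is worth ruling out before raising the conversion as a bug.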

Ben.Lyons1 · 27-05-21

Hi Krishna,

You will experience difficulty training Decipher to read multiple fields in a single table cell. I try to imagine what the data might look like as a table in Excel to see how I can extract it.

You can, however, create one field for the multi-line value, then create additional fields for the required separation and use formulas in Decipher to extract the data. If you mark the original field as "Non-exportable", only the calculated fields will go to Blue Prism. So you would have the table fields "Full Details", "Line 1", "Line 2" and "Line 3", as well as any other table fields required.
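
The target layout described above can be sketched outside Decipher. This is a minimal Python illustration of the intended result only, not Decipher formula syntax; the function name `split_full_details` and the sample value are hypothetical, and it assumes the line breaks survive in the "Full Details" value.

```python
from typing import List

def split_full_details(full_details: str, expected: int = 3) -> List[str]:
    """Split a multi-line cell value into a fixed number of line fields."""
    # splitlines() breaks on \n, \r\n and \r (and other line boundaries);
    # blank lines are dropped and surrounding whitespace is trimmed.
    lines = [ln.strip() for ln in full_details.splitlines() if ln.strip()]
    # Pad with empty strings so "Line 1" .. "Line N" always exist.
    lines += [""] * (expected - len(lines))
    # Extra lines beyond `expected` are discarded in this sketch.
    return lines[:expected]

# Hypothetical "Full Details" value with three lines.
full = "REF-001\r\nSECOND LINE\nTHIRD LINE"
line_1, line_2, line_3 = split_full_details(full)
print(line_1, "|", line_2, "|", line_3)
```

If a cell has fewer lines than expected, the remaining fields come back empty rather than shifting values into the wrong column.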

I'm afraid I haven't seen that date issue before. Please raise a support ticket and hopefully we can get to the bottom of it.

Regards

KrishnaElapavul · 27-05-21

Hi Ben,
Thanks for the quick reply.

I tried something but have a few questions.

I created an extra column (corrected_L2, Autocalculate) and applied the formula SUBSTR(FT_2_ITEMS_0,1,10) for testing.
I created L2 as Assignable and Multiline, and then I got a value in the new column, as shown below.
The value in L2 came through as "G325007322VIERINTALAAKERI7322 BECBM".
Then I added the "Dynamic list" flag to the new column, but I am still getting all 3 lines as one string.
Also, if you look at the corrected column, it is not calculated from each row's L2 value; it shows the first line's value for every line.
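
To show what a fixed-offset substring can and cannot recover from the merged value, here is a small Python sketch. It assumes SQL-style semantics for SUBSTR (1-based start, then length); whether Decipher's SUBSTR behaves exactly this way is an assumption, and `substr` is a hypothetical helper written only for this illustration.

```python
def substr(text: str, start: int, length: int) -> str:
    """SQL-style SUBSTR: 1-based start position, then a length (assumed)."""
    return text[start - 1 : start - 1 + length]

# The L2 value quoted above, with the three lines already merged.
value = "G325007322VIERINTALAAKERI7322 BECBM"

first = substr(value, 1, 10)
rest = substr(value, 11, len(value))

print(first)  # G325007322
print(rest)   # VIERINTALAAKERI7322 BECBM
```

Under that assumption the first 10 characters can be cut out, but only when the first line is always exactly 10 characters long. The remainder has no delimiter left between the second and third lines, so fixed offsets alone cannot separate them.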

Can you throw any light on this?
Once again, thanks a lot for your suggestions.

------------------------------
Krishna Elapavuluri RPA Solution Lead
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Ben.Lyons1 · 28-05-21

Hi Krishna,

I'm not sure why you've added the Dynamic List flag. It corresponds to a SQL database lookup function (you can get a value/list from a database based on a value read from the document, dynamically).

If you have the multi-line flag set, this should output the read field as it's written. I'm not aware of a formula in Decipher that can then split this by line break, but I'll check with the development team and come back to you.

Thanks
