OCR region not marked correctly by Decipher

KrishnaElapavul
Level 6
Hi,
While training a document in Decipher, I noticed strange behavior in the regions that Decipher marks.
You can see it in the attached screenshot, in the red and blue lines. Are there any additional settings that need to be configured?
I am doing simple rule-based training (not ML). The expected value is the first line after the number "0010/0020".
If you look at the screenshot:
The 1st line is read properly.
The 2nd line is not read, because the marked region covers 3 lines in the PDF.
The 3rd line is mapped to a value, but the wrong one, and the marked region covers 2 lines.
Does anyone have any idea?


9876.png

------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Ben.Lyons1
Staff
Hi Krishna,

How many documents have you trained using this DFD?

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------
Ben Lyons
Principal Product Specialist - Decipher
SS&C Blue Prism
UK based

Hi Ben,
Thanks for responding.
I have trained 40+ documents, all different PDFs from the same customer, and I still face the same issue.
I tried processing the same file again after training and submitting it, but unfortunately I get the same issue when re-processing that file.


------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Hi Krishna,

Are you able to use any Sample Headers for each column? For example, if the document labels Column A as "Reference", you can add this to the Sample Headers for that column.

Also, have you set the respective fields to have a multi-line flag?

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------

Yes Ben,
I have added sample headers in the DFD.
I have set the flag to "Assignable"; I have not selected "Multiline".

------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Hi Krishna,

I would recommend updating the fields to Multi-Line to start with, and then retraining them. You may need to delete your existing training data; however, this will delete it for all documents and users in your environment. If that is not possible, it will just take a little longer for the training to update.

I can't see any reason why this should be happening, and I have no other hints to offer. If you believe the product is not operating as expected, I would recommend raising a support ticket.

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------

Thanks Ben,
Yes, I suggested a similar solution: read the value as a block and normalize it at the BP level. My customer is not happy with that approach, because the customer wants to train the docs in production and to reduce technical dependency.

I have also noticed another bug, this time with date conversion.
9845.png
If you look at the image above, Decipher finds all the values correctly, but when I read the values from the BP queue for further processing, the date is converted wrongly, as shown below.

Date-1: wrong, a different date is returned (24 is printed instead of 14)
9846.png

Date-2: the value is returned correctly

9847.png
The pattern is that lines 1 and 3 come out wrong, while lines 2 and 4 are converted correctly.
I have set the column type to Date, and I will receive different date formats for the same DFD.


------------------------------
Krishna Elapavuluri
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Hi Krishna,

You will have difficulty training Decipher to read multiple fields in a single table cell. I try to imagine what the data might look like as a table in Excel to see how I can extract it.

You can, however, create one field for the multi-line value, then create additional fields for the required separation and use formulas in Decipher to extract the data. If you mark the original field as "Non-exportable", only the calculated fields will go to Blue Prism. So you would have the table fields "Full Details", "Line 1", "Line 2" and "Line 3", as well as any other table fields required.
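
To make the target shape concrete, here is a minimal Python sketch of that separation. It only illustrates the intended result and is not Decipher formula syntax. The function name `split_full_details`, the sample value, and the assumption that the multi-line value still contains its line breaks are all hypothetical.

```python
def split_full_details(full_details: str, expected_lines: int = 3) -> dict[str, str]:
    """Split a multi-line cell value into "Line 1".."Line N" fields.

    Assumption: the multi-line value still contains its line breaks.
    Missing lines come back as empty strings; extra lines are dropped.
    """
    # splitlines() handles \n, \r\n and \r; blank lines are skipped
    lines = [line.strip() for line in full_details.splitlines() if line.strip()]
    # Pad with empty strings so every expected field exists
    lines += [""] * (expected_lines - len(lines))
    return {f"Line {i + 1}": lines[i] for i in range(expected_lines)}


# Hypothetical sample value standing in for the "Full Details" field
sample = "first line\nsecond line\nthird line"
fields = split_full_details(sample)
print(fields["Line 1"])
print(fields["Line 3"])
```

The same logic could run at the Blue Prism level if no equivalent formula is available in Decipher.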

I'm afraid I haven't seen that date issue before. Please raise a support ticket and hopefully we can get to the bottom of it.

Regards

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------

Hi Ben,
Thanks for the quick reply.

I tried something, but I have a few questions:
  • I created an extra column (corrected_L2, Autocalculate) and applied the formula SUBSTR(FT_2_ITEMS_0,1,10) for testing.
  • I set L2 as Assignable and Multiline, and then got the value in the new column as shown below.
  • The value in L2 came out as "G325007322VIERINTALAAKERI7322 BECBM".
  • I then added the "Dynamic list" flag to the new column, but I am still getting all 3 lines as one string.
  • Also, the corrected column is not calculated from the L2 value of each row; it shows the first line's value for all lines.
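
As an illustration of what a fixed-position substring does with that value, here is a minimal Python sketch. The helper `substr` is hypothetical and assumes 1-based `(value, start, length)` semantics, which is an assumption about SUBSTR and not something confirmed from the Decipher documentation.

```python
def substr(value: str, start: int, length: int) -> str:
    """Hypothetical stand-in for SUBSTR(value, start, length).

    Assumption: start is 1-based and length counts characters.
    """
    begin = start - 1  # convert the 1-based start to Python's 0-based index
    return value[begin:begin + length]


# The L2 value exactly as it was read, with all three lines joined together
l2_value = "G325007322VIERINTALAAKERI7322 BECBM"

first_part = substr(l2_value, 1, 10)
print(first_part)  # G325007322
```

Under that assumption the formula can only ever return the first ten characters of the joined string, so it separates the lines only if each line has a known, fixed width.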
Can anyone shed any light on this?
Once again, thanks a lot for your suggestions.


9862.png

------------------------------
Krishna Elapavuluri RPA Solution Lead
Technology Consultant
DXC.technology
Asia/Kolkata
------------------------------

Hi Krishna,

I'm not sure why you've added the Dynamic List flag. It corresponds to a SQL database lookup function (you can dynamically get a value or list from a database based on a value read from the document).

If you have the multi-line flag set, this should output the read field as it's written. I'm not aware of a formula in Decipher that can then split this by line break, but I'll check with the development team and come back to you.

Thanks

------------------------------
Ben Lyons
Product Consultant
Blue Prism
UK
------------------------------