11-11-23 07:57 AM
Hi @Ben.Lyons1,
I have a set of 11 images with the same structure. The first time, I assigned the regions for each field during manual verification, but when I passed the same 11 images through a second time, it still selected the wrong regions. It is not learning from what I corrected previously.
1.
For the Company Name field, it picks up only the first line. In the previous batch I had already assigned the region to the entire text box, but it still picks up only the first line. I have selected the Multiline flag as well.
The Company Profile field has the same issue: it does not extract the data after the first line, and even if we assign the region manually the text is not correct. For example, "a" is extracted as "@", and some new characters appear that are not present in the image at all.
2.
The PE Ratio field is mapping to Sedol. It should map to the correct one, as I specified the exact keyword in the DFD.
Characters like O and I are being read as zero (0) and one (1) by Decipher for the Outstanding Shares field.
For the Chairman field I also assigned the correct region, but when we send a new batch it maps to the old one.
3.
Even if we specify the correct keywords in the DFD, it maps them to other fields.
I tried this with the same documents 5-6 times and the same issue still happens: the values are not recognized. I deleted the entire training data and started fresh, but still no luck.
What needs to be done to get the exact values? The image quality is also good.
Am I missing anything during training? As far as I know, we do not need to enable any setting in Decipher for training; it will learn automatically if the template is the same.
13-11-23 08:14 AM
Hi Salman,
Have you tried training Decipher without specifying any keywords? This can produce better results where some keywords are causing incorrect data to be auto-selected. Where some fields are blank, it is also likely to stop Decipher from looking for other text in the area.
It looks quite structured. Have you considered trying the misc parameter FormFields = On?
Thanks
13-11-23 09:48 AM
Hi Ben,
Initially I tried without specifying any keywords, but that did not work, so I added keywords, and it still did not work for me.
In my images the field positions might shift randomly; they are not fixed, so I have not used the FormFields parameter.
Do you want me to use the misc parameter in my case?
Even though I selected the Multiline flag and assigned the proper region, it does not detect the multiline text from the second run onwards.
13-11-23 01:34 PM
Hi Salman,
Tricky, in this case I wouldn't recommend Form Fields.
Are you deleting your training data each time you try a new DFD configuration?
Tesseract OCR is also known to find it harder to read text on a coloured background, maybe try adjusting the image contrast or converting it to greyscale.
If you'd like some "hands on" help, you may be eligible for some time with our professional services team via a Knowledge Support session. Please check with your SS&C Blue Prism account manager for details.
13-11-23 02:03 PM
Hi Ben,
No, I am not deleting the training data whenever I change the DFD configuration.
Is Decipher not capable of reading text on a colored background?
If I have to adjust the image contrast or convert it to greyscale, can I set this once in Decipher, or do I need to do it manually?
Is the colored background the reason the values are extracted wrongly? For example, the letter I is extracted as 1, the letter O as 0 (zero), the letter a as @, and multiline text is detected as only a single line.
13-11-23 02:16 PM
Hi Salman,
Check out the best practice guidance on how to configure/train your DFD: Decipher IDP best practices (blueprism.com). This will help get you up and running as quickly as possible.
It's not Decipher so much as it's the Tesseract OCR engine. Decipher uses Tesseract 5 as its primary OCR engine as it's the most comprehensive, freely available OCR engine on the market. Our engineering team carry out a number of activities to refine the performance, but we are still working within the available features of the product.
Image adjustments would be manual, but it might give you an idea on why your results aren't 100%.
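For anyone who wants to experiment with this outside Decipher, here is a minimal Python sketch of the two adjustments mentioned: converting to greyscale and stretching the contrast. It is not a Decipher feature or setting. The function names are hypothetical, the luma weights are the standard ITU-R BT.601 ones, and the tiny 2x2 sample stands in for a real scan with dark text on a coloured background.

```python
# Illustrative sketch only: not part of Decipher. All names here are hypothetical.

def to_greyscale(pixels):
    """Convert rows of (R, G, B) tuples into rows of 0-255 luminance values."""
    # ITU-R BT.601 luma weights for red, green and blue.
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


def stretch_contrast(grey):
    """Linearly rescale so the darkest pixel becomes 0 and the lightest 255."""
    lo = min(min(row) for row in grey)
    hi = max(max(row) for row in grey)
    if hi == lo:
        # Flat image: nothing to stretch, return a copy unchanged.
        return [row[:] for row in grey]
    scale = 255 / (hi - lo)
    return [[round((v - lo) * scale) for v in row] for row in grey]


# Tiny 2x2 sample: dark blue "text" pixels on a lighter blue background.
sample = [
    [(20, 30, 90), (90, 140, 220)],
    [(90, 140, 220), (20, 30, 90)],
]

grey = to_greyscale(sample)
enhanced = stretch_contrast(grey)
print(grey)      # [[34, 134], [134, 34]]
print(enhanced)  # [[0, 255], [255, 0]]
```

On real documents you would apply the equivalent operations with an image editor or an imaging library (Pillow's `ImageOps.grayscale` and `ImageOps.autocontrast`, for example) to copies of the files, then compare the OCR results before and after.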
The multiline flag is more to create the space in the validation screen (there are other impacts in how the data is processed, e.g. how line breaks are stored). You can train multiline fields without that flag being selected. Your results may be negatively influenced by previous rounds of training where you've made changes to your DFD without restarting the training.
Thanks
13-09-24 09:41 AM
It sounds like the issue might be with:
Revalidate your setup to improve accuracy.