11-04-23 02:16 PM
Hello,
I need assistance with extracting decimal values from specific pages of Spanish treasury documents that have the following format:
As you can see, each box containing the value to extract has a code in the format 00DDD.
I am encountering issues when the box is empty, as Decipher tries to obtain the closest value that meets the conditions set in the Document Form Definition (DFD). However, I would like it to leave the box empty when there's no value. For example, when attempting to extract the 00102 field from the second example, it reads 696,518.11. In some cases, Decipher even tries to find values on other pages with boxes containing similar codes.
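To make the desired behaviour concrete, here is a minimal Python sketch of the rule I am after. It is only an illustration, not Decipher configuration or its API. It assumes (hypothetically) that the page text is available as plain text, that each box appears as its 00DDD code followed on the same line by its amount, and that amounts are written like 696,518.11. The names `BOX` and `extract_boxes` are made up for this example.

```python
import re
from decimal import Decimal

# A box code in the 00DDD format, optionally followed ON THE SAME LINE by an
# amount such as 696,518.11 or 1250.00. A newline ends the search for a value,
# so an empty box never borrows the amount of a neighbouring box.
BOX = re.compile(
    r"\b(00\d{3})\b"
    r"(?:[ \t]+(-?(?:\d{1,3}(?:,\d{3})+|\d+)\.\d{2})\b)?"
)

def extract_boxes(page_text):
    """Map each box code to its Decimal amount, or None when the box is empty."""
    values = {}
    for code, amount in BOX.findall(page_text):
        # findall returns '' for the optional group when no amount was found.
        values[code] = Decimal(amount.replace(",", "")) if amount else None
    return values

# Hypothetical page text: box 00102 is empty.
sample = "00101 1,250.00\n00102\n00103 696,518.11"
result = extract_boxes(sample)
```

In this sample, 00102 maps to `None` instead of picking up 696,518.11 from the next box, which is the outcome I would like Decipher to produce.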
The fields are configured as follows:
Despite correcting several documents in Data Verification, the problem persists with each new batch.
I would greatly appreciate any help or guidance on how to configure the DFD for documents like these.
Thank you in advance.
12-04-23 03:55 PM
Hi Abel,
Seems like quite a tricky situation.
Have you tried setting the field type to Money? Did it perform any differently? I'd recommend trying it without the Format Expression.
I appreciate you've already set StrictPosition, but since this looks like a form, have you tried using a single instance of FormField=On instead (removing all the StrictPosition entries)?
Thanks
Ben
13-04-23 08:15 AM
UPDATE: I followed Ben Lyons' guidance and updated the Document Form Definition. I then uploaded a new batch of three documents. However, the issue persists for the empty fields, as shown in the "01 - Original Data Verification.JPG" attachment.
To resolve this, I selected the correct regions for each empty field, as shown in the "02 - Corrected Data Verification.JPG" and "02 - Corrected Data Verification2.JPG" attachments.
I then loaded a second batch of three documents, but unfortunately, the issue still persists for the empty fields, as shown in the "03 - Second Batch Errors.JPG" attachment.
Is there anything else I could try?
Thank you.
13-04-23 09:20 AM
Hi Abel,
You may need to delete your training data as it will still contain your previous document training and may be incorrectly influencing your new configuration.
Thanks
Ben
14-04-23 10:26 AM
Hi Ben,
I just want to confirm that I'm deleting the training data correctly, as I'm new to this tool:
I deleted the Capture Model that was assigned to the Document Type in the Admin Panel -> System -> Capture Models. After that, I created a new ML Model from the Document Type edit page and configured it as shown in the image below:
14-04-23 11:50 AM
Hi Abel,
That's not what I'm referring to. The ML Capture Model is a separate bit of training, and I don't generally recommend creating those so early on. Decipher has a primary ML model that it builds without any configuration; this model is the Training Data, and you can access some controls for it in Admin Panel > Training Data. Be mindful that clicking the delete button will delete the training for the whole environment.
I recommend running through our best practice guide here to walk through some of these concepts. And if you haven't already, the Decipher learning course on BP University will also help familiarise you with many of the fundamentals.
Thanks
Ben