05-03-24 01:00 PM
I am working on a use case that will use Decipher for document classification only, not for data extraction. I haven't used Decipher before, and the training material all seems to assume that you will be extracting data.
To get started with classification only, do I need to create DFDs? If so, do they need to contain the fields that appear on the form?
If not, how do I get started training the classification model without first defining DFDs?
Secondly, we will be working with multiple documents. Each document has multiple variants, each of which will appear in at least two languages. Is one classification model appropriate for this situation or would it be best to use multiple?
06-03-24 08:11 AM
Hi Felix,
For a batch to be exported and its data to be available in the VBO, it needs to be attached to a DFD. However, you could just create a single field, de-select "assignable" on it, and set the default field value to the name of the Document Type.
Classification models can be trained with hundreds of variants for the same Document Type, but you would need to train it with the known variants. This can be updated later on with further variants if you set the model to extensible. More info on this can be found here.
Thanks
07-03-24 10:05 AM
Thanks Ben, that gave me some info I couldn't find in the documentation and training, so super useful. One follow-up question: would a DFD be required for each document variant in this case, or only for each main document type?
08-03-24 08:22 AM
Hi Felix,
You'll only need one DFD per Document Type. This is generally the case with Decipher, as it can extract data from multiple variants using the same configuration. It applies even more in this use case, as you're not currently looking to extract any data from the document.
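To make the shape of that setup concrete, here is a rough Python sketch of the idea only. It is not Decipher's API (DFDs are configured in the Decipher UI), and every name in it (`PlaceholderField`, `DFD`, `classification_only_dfd`, the example document types and variants) is hypothetical. It just models "one DFD per Document Type, each with a single non-assignable field whose default value is the Document Type name".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaceholderField:
    """Hypothetical stand-in for the single field on a classification-only DFD."""
    name: str
    default_value: str
    assignable: bool = False  # de-selected, so nothing is captured from the page


@dataclass(frozen=True)
class DFD:
    """Hypothetical stand-in for a DFD: one per Document Type, not per variant."""
    document_type: str
    fields: tuple


def classification_only_dfd(document_type: str) -> DFD:
    # The only field simply carries the Document Type name as its default value.
    return DFD(document_type, (PlaceholderField("DocumentType", document_type),))


# Assumed example document types; substitute your own.
document_types = ["Invoice", "Purchase Order"]
dfds = {doc_type: classification_only_dfd(doc_type) for doc_type in document_types}

# Variants and languages all resolve to the DFD of their Document Type.
variants = [
    ("Invoice", "layout A", "en"),
    ("Invoice", "layout A", "de"),
    ("Invoice", "layout B", "en"),
    ("Purchase Order", "layout A", "en"),
]

for doc_type, layout, language in variants:
    dfd = dfds[doc_type]
    print(f"{doc_type} / {layout} / {language} -> exported value: {dfd.fields[0].default_value}")
```

However many variants or languages are added, the number of DFDs in the sketch stays equal to the number of Document Types.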
Thanks
