Hi,
I want to report incorrect behavior in Decipher when working with regular expressions.
I am using Decipher to extract information from ID documents, so I have covered the sensitive information in all the screenshots I attach. Here is one example document:
I want to extract the name of the person (all the words covered in white), so the header of the field is "NOMBRE". To prevent Decipher from extracting the alphanumeric code covered in blue, I wrote this regex:
([A-ZÁÉÍÓÚÜÑ]+[\n ]+[A-ZÁÉÍÓÚÜÑ]+)([\n ]+[A-ZÁÉÍÓÚÜÑ]+)*
The regex makes Decipher extract a value that has two or more words (allowing all the Spanish uppercase characters, but no numbers), separated by spaces or newlines.
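To check the pattern outside Decipher, here is a minimal sketch in Python. It assumes the field validation requires the whole value to match (I do not know how Decipher actually applies the regex), and the sample names and the helper name is_valid_name are made up for illustration:

```python
import re

# Same pattern as above: two or more uppercase words separated by spaces or newlines.
NAME_PATTERN = re.compile(
    r"([A-ZÁÉÍÓÚÜÑ]+[\n ]+[A-ZÁÉÍÓÚÜÑ]+)([\n ]+[A-ZÁÉÍÓÚÜÑ]+)*"
)


def is_valid_name(value: str) -> bool:
    """Return True if the whole value matches the pattern (assumed validation rule)."""
    return NAME_PATTERN.fullmatch(value) is not None


# Hypothetical sample values, not taken from a real document.
print(is_valid_name("MARÍA JOSÉ\nGARCÍA LÓPEZ"))  # multiline name
print(is_valid_name("ABC123456"))                 # alphanumeric code
print(is_valid_name("MARÍA"))                     # single word
```

With whole-value matching, a two-line name made only of uppercase words is accepted, while an alphanumeric code or a single word is rejected, which is the behavior I expect from the field.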
As shown in the first screenshot, Decipher did not extract the second line of the name (it is a multiline field), so I manually resized the box of the field. After doing this, the validation of the field fails and the box turns red, even though the data should fit the regex.
The workaround I found is:
- Click inside the field
- Modify the data inside (e.g. remove a character)
- Click out of the field
- Now the field turns green, showing the data format is valid
- Click inside the field and undo the modification I made (re-enter the character I deleted)
After these steps, the field contains the same data as before, but Decipher now recognizes its format as valid.
------------------------------
Oroel Ipas
------------------------------