Best way to read text reliably

ChrisRider · 11-09-23

I have a Win32 application that displays a table of text. One of the columns contains what I will call the name I am looking for. My current process can navigate the table and uses OCR (Tesseract) to read the name into a data item. However, the OCR is perhaps only about 80% accurate.

I use the OCR result to find the specific row in the table so I can double-click on it to take action. However, because the OCR is not accurate, the text it reads often does not match what I am looking for.

I tried to experiment with some alternative solutions. One idea I had was to see if I could manually navigate the table and press ctrl-c to copy the name into a buffer so I could paste it. The Win32 application does not support this.

Any advice on how to read the text reliably? I have disabled font smoothing and I think I am using the correct font for OCR, but it still does not work well enough.

In the past, I have made what I call a "fudge factor" table. If a misread is repeatable, the table can translate it. As an example, suppose OCR reads the text "Welcome" as "VVe1come". My fudge table would hold that bad value, so if I were looking for "Welcome", the process would know to expect the fudged value.
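
A minimal sketch of the fudge table idea described above, written in Python purely for illustration. The names `FUDGE_TABLE`, `normalize`, and `find_row` are hypothetical and not part of Blue Prism or Tesseract; the only entry is the "VVe1come" example from this post.

```python
# Hypothetical fudge table: known, repeatable OCR misreads mapped to the intended text.
FUDGE_TABLE = {
    "VVe1come": "Welcome",
}


def normalize(ocr_text: str) -> str:
    """Return the corrected value if the OCR output is a known misread, else the text as read."""
    cleaned = ocr_text.strip()
    return FUDGE_TABLE.get(cleaned, cleaned)


def find_row(ocr_rows, target):
    """Return the index of the first row whose normalized text equals target, or -1 if none does."""
    for index, raw in enumerate(ocr_rows):
        if normalize(raw) == target:
            return index
    return -1
```

A lookup like this only helps when the same misread shows up again, since every bad value has to be entered in the table ahead of time.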

For this application, there are thousands of variations and they do not seem repeatable, so I am not sure my kludge solution would be feasible.

------------------------------
Chris Rider
lead application analyst
E. W. Scripps
Cincinnati, OH
------------------------------

Denis__Dennehy · 11-09-23

Hi Chris,

If the OCR is not reading the value reliably every time, I recommend using the tried and tested font recognition functionality within the Blue Prism (BP) product. If you have not already done the Surface Automation training, I recommend seeking it out; it will give you all the information you should need to read text.
Unlike OCR, the Font Recognition functionality reads the pixels that make up the text and compares them to stored font files, so the result will either be a match or not; it will never have the variance or unpredictability that you might get with some OCR.
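
To illustrate why exact pixel comparison gives a match-or-no-match result, here is a toy sketch in Python. It is not Blue Prism's implementation: the `FONT` dictionary, the 3x3 glyph bitmaps, the fixed glyph width, and the `read_text` function are all assumptions made for the example, and real font files and character segmentation are more involved.

```python
# Hypothetical stored "font": each character maps to its exact pixel bitmap.
# '#' is an ink pixel and '.' is background; every glyph here is 3 pixels wide.
FONT = {
    "I": ("###", ".#.", "###"),
    "L": ("#..", "#..", "###"),
    "T": ("###", ".#.", ".#."),
}
GLYPH_WIDTH = 3


def read_text(rows):
    """Read fixed-width glyphs from pixel rows.

    Returns the text only if every glyph matches a stored bitmap exactly;
    otherwise returns None. There is no "closest guess" as with OCR.
    """
    if not rows or len(rows[0]) % GLYPH_WIDTH != 0:
        return None
    lookup = {bitmap: char for char, bitmap in FONT.items()}
    chars = []
    for x in range(0, len(rows[0]), GLYPH_WIDTH):
        # Cut out one glyph-sized cell and look it up pixel for pixel.
        cell = tuple(row[x:x + GLYPH_WIDTH] for row in rows)
        char = lookup.get(cell)
        if char is None:
            return None
        chars.append(char)
    return "".join(chars)
```

In this sketch a region either decodes to text or is rejected outright, which is also why settings such as font smoothing matter: a single differing pixel is enough to prevent a match.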

ChrisRider · 12-09-23

Hi Denis - thank you for the reply. I am currently looking into this - excited to see it working. This would really improve a few of my current automations and prevent a lot of kludge work.

SS&C Blue Prism Community