Reading grayed-out text from remote client screen

PrateekDwivedi
Level 2
We are building a solution to automate a remote application that we access through Citrix. Because we are using Surface Automation, our only options for reading text from this screen are Read text with OCR and Recognize Text. As I currently understand it, Recognize Text requires font smoothing to be switched off so that we can spy the font or create our own. We do not have the privileges to change the smoothing setting on the remote client, so we cannot use Recognize Text. This leaves us with Read text with OCR.

The data we are reading is grayed out in the application. Read text with OCR reads it correctly about 8 times out of 10, but the other 2 times it returns badly garbled text or no data at all. Is there some way to contain this erratic behavior? I have read that the reliability of Read text with OCR is not great, but does anyone have suggestions for improving the recognition rate? Any help will be greatly appreciated.
1 REPLY

Denis__Dennehy
Level 15
To improve the reliability of the Tesseract OCR, I can recommend trying the following:

* Experiment with the Scale option; try various values between 4 and 10.
* Use whitelist characters where possible. For example, if you are reading a decimal value field, your whitelist should be set to "0123456789."
* If colours seem to be impacting your results, try converting your image to B&W before performing OCR on it.
* If you are reading text for some kind of comparison rather than for data entry elsewhere, then comparing with an Approximate Matching algorithm is possible. This reduces the need for 100% accuracy in what you read using OCR.

The true answer to your issue is to have font smoothing turned off, as Recognize Text is reliable in reading known fonts. OCR should only be used in some use cases, and only after extensive testing to ensure accuracy.
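To illustrate the Approximate Matching suggestion, here is a minimal sketch in Python using the standard library's difflib. It is not Blue Prism code; the function names and the 0.85 threshold are assumptions chosen for illustration, and the same idea would need to be adapted to a code stage or an equivalent utility in your environment.

```python
from difflib import SequenceMatcher


def normalize(text):
    """Collapse runs of whitespace and ignore case before comparing."""
    return " ".join(text.split()).casefold()


def similarity(a, b):
    """Return a similarity ratio between 0.0 and 1.0 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_close_match(ocr_text, expected, threshold=0.85):
    """Treat the OCR result as a match if it is similar enough to the expected text.

    The threshold is a hypothetical starting point and should be tuned
    against real samples from the target screen.
    """
    return similarity(ocr_text, expected) >= threshold


# A single misread character (0 instead of o) still counts as a match,
# while an empty read does not.
print(is_close_match("Acc0unt Closed", "Account Closed"))
print(is_close_match("", "Account Closed"))
```

A check like this can also serve as a guard: if the OCR result does not come close to any expected value, the process could re-read the region a few times before raising an exception.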