Blue Prism Language Capability

RajeshRamu
Level 2
Can Blue Prism recognize the language used in a given PDF using the Tesseract language packages, or are the packages only used to read those languages more accurately? For example, if an input PDF contains 4 different languages and all the language packages are installed, can BP identify the languages, or can it only read the text and return the output?
2 REPLIES

Denis__Dennehy
Level 15
You could theoretically have 4 language packs installed and use them all as part of the same screen read. There is an input parameter on the Read Text with OCR navigate action that says which language to use. I would caveat anything on this topic by saying that no OCR technology from any vendor is 100% reliable, and if you require full accuracy, a true document reading and verification tool needs to be used alongside Blue Prism. Your Blue Prism Partner or Account Manager would be able to discuss some of those tool options with you. They do not just OCR text: they contain document training features, OCR, and a verification step where manual users eyeball the read text to ensure it has been OCR'd accurately, and they convert the OCR'd text into a structured format that can be used by Blue Prism.
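
Since the language is passed in as a parameter, one possible workaround for guessing an unknown language is to read the same page once per installed language pack and keep the language whose read scores best. The Python sketch below only illustrates that idea outside Blue Prism; it is not a Blue Prism feature. The names `guess_language` and `ocr_confidences` are hypothetical, and the sketch assumes the OCR engine can report a per-word confidence between 0 and 100, with negative values for boxes that are not words.

```python
from statistics import mean
from typing import Callable, Dict, Iterable, Sequence, Tuple

# Hypothetical callback: given a page and a Tesseract language code,
# return the per-word confidences (0-100) of that OCR read.
OcrFn = Callable[[object, str], Sequence[float]]


def guess_language(page: object,
                   languages: Iterable[str],
                   ocr_confidences: OcrFn) -> Tuple[str, float]:
    """Read the page once per language and return (language, mean confidence)
    for the best-scoring read. This is a heuristic, not true detection."""
    scores: Dict[str, float] = {}
    for lang in languages:
        # Assumption: negative confidences mark non-word boxes, so skip them.
        confs = [c for c in ocr_confidences(page, lang) if c >= 0]
        scores[lang] = mean(confs) if confs else 0.0
    if not scores:
        raise ValueError("at least one language code is required")
    best = max(scores, key=scores.get)
    return best, scores[best]
```

In practice `ocr_confidences` would wrap whatever OCR engine is available, called with language codes such as `eng`, `fra`, `nld` and `spa`. Languages that share the Latin script can score closely, so the result should be treated as a guess and verified like any other OCR output.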

Thank you, Denis. But is this possible? With all the language packages available, will Blue Prism be able to detect which language is used in the PDF? For example, if four language packages are installed (English, French, Dutch, Spanish) and the input PDF is written in one of those languages (e.g. Spanish), can Blue Prism read the PDF and recognize the language as Spanish?