Fuzzy Matching

Hello! Has anyone been able to try the concept of 'Fuzzy Matching' (https://en.wikipedia.org/wiki/Fuzzy_matching_(computer-assisted_translation)) by using Blue Prism? I was thinking of creating a VBO specifically for this but find it hard to know where to begin. I read this concept could be used as suggested by Blue Prism's "Increasing Data Quality" data sheet (see documentation). Here's the exact info given in the PDF:

Fuzzy match translation
Data can be corrected or translated using “fuzzy match” requirements, provided there is a clear rule for doing so. For example, location names can be corrected using techniques borrowed from spell-checking algorithms to identify the closest match from a “dictionary” of approved values. For example, using a “Levenstein Distance” calculation the following corrections might be made:
• “Sollihull hospital” becomes “Solihull Hospital” (corrected the double “L” and the capitalisation of “H”)
• “Blue Prism” becomes “Blue Prism Limited”

So my question is: has anyone done something like this before? Is this done by writing our own Visual Basic code, using a VBO, using a separate program, ...?

Thanks for any info related to this subject!
Sébastien
6 REPLIES

John__Carter
Staff
Hi Sébastien - there is no official VBO available, but maybe someone out there has tried it. I would imagine a single code stage would be enough, and searching for '.NET Levenshtein Distance' offers many examples. These two look like they will paste straight in with minimal adjustment: https://social.technet.microsoft.com/wiki/contents/articles/28961.leven… https://www.programmingalgorithms.com/algorithm/levenshtein-distance?la…
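For anyone who wants to see the idea before opening those links, here is a minimal sketch of a Levenshtein distance and a "closest approved value" lookup. It is written in Python only for readability and would need porting to VB.NET or C# for a Blue Prism code stage. The function names `levenshtein` and `closest_match` are made up for this illustration, and comparing case-insensitively is an assumption, not something the data sheet specifies.

```python
from typing import Sequence, Tuple


def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the working rows as short as possible
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_match(value: str, approved: Sequence[str]) -> Tuple[str, int]:
    """Return the approved entry nearest to value, together with its distance."""
    # Assumption: case is ignored when measuring distance, so that
    # "hospital" vs "Hospital" does not count as an edit.
    scored = ((entry, levenshtein(value.lower(), entry.lower())) for entry in approved)
    return min(scored, key=lambda pair: pair[1])


approved_values = ["Solihull Hospital", "Blue Prism Limited"]
print(closest_match("Sollihull hospital", approved_values))
```

With the two-entry list above, the misspelled hospital name is one edit away from its approved form once case is ignored. Note that `min` simply returns the first entry on a tie, so a real VBO would probably want to report ties rather than pick one silently.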

I've been able to create a VBO for it, thanks 😉 I've used the Levenshtein distance as well as the Jaro-Winkler similarity; I believe both will prove useful for my OCR needs.
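The VBO itself was not posted, so the following is only a sketch of what a Jaro-Winkler score could look like, not the poster's implementation. It is in Python for readability; the names `jaro` and `jaro_winkler` and the `prefix_weight` parameter are chosen for this illustration, with the commonly used defaults of a 0.1 prefix weight and a prefix capped at four characters.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters only count as matching if they are within this many positions.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Walk both strings' matched characters in order and count disagreements.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_weight: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_weight * (1 - score)


print(round(jaro_winkler("MARTHA", "MARHTA"), 3))
```

Unlike Levenshtein distance, this is a similarity (higher is better) and it rewards a matching start of the string, which may suit OCR output where the first characters are usually read correctly.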

John__Carter
Staff
Very good. Just bear in mind that all OCR and fuzzy matching is basically a guess that can be wrong.
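One way to act on that caution is to accept a correction only when the best candidate is convincingly close, and otherwise route the item to an exception. The sketch below uses Python's standard `difflib.get_close_matches`, which scores with `SequenceMatcher` ratios rather than Levenshtein or Jaro-Winkler; the function name `correct_or_flag` and the 0.85 cutoff are assumptions for illustration and would need tuning against real data.

```python
import difflib
from typing import Optional, Sequence


def correct_or_flag(value: str, approved: Sequence[str],
                    cutoff: float = 0.85) -> Optional[str]:
    """Return the best approved value, or None if nothing is similar enough."""
    # get_close_matches returns candidates scoring at least `cutoff`,
    # ordered from most to least similar.
    candidates = difflib.get_close_matches(value, approved, n=1, cutoff=cutoff)
    if not candidates:
        return None  # no confident guess: leave it for manual review
    return candidates[0]


approved_values = ["Solihull Hospital", "Blue Prism Limited"]
for raw in ["Sollihull hospital", "Blue Prism", "Leeds"]:
    result = correct_or_flag(raw, approved_values)
    print(raw, "->", result if result is not None else "needs manual review")
```

With this cutoff the misspelled hospital name is corrected, while "Blue Prism" is not automatically expanded to "Blue Prism Limited" because the strings differ too much. Whether that is the right behaviour depends on how costly a wrong correction is in the process.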

MahmudBarrak
Level 2
Hello Sébastien, I'm currently facing the same challenge. Could you please share the VBO you created? Thanks a lot!

BenKirimlidis
Level 7
Late to the party, but Levenshtein distance functions can be useful here for finding matches where the input is close to the target string but not identical.

BenKirimlidis
Level 7
I thought it was 2018... very late to the party.