Question

What is the best fuzzy matching algorithm (fuzzy logic, n-gram, Levenshtein, Soundex, ...) for processing more than 100,000 records in the least time?

Solution

I suggest you read the articles by Navarro listed in the References section of the Wikipedia article titled "Approximate string matching". Basing your decision on actual research is always better than relying on suggestions from random strangers, especially if performance on a known set of records is important to you.

OTHER TIPS

It depends heavily on your data. Some fields can be matched more reliably than others. For example, a postcode has a defined format, so it can be compared differently from free-form strings. People can be matched on initials and date of birth, or on other combinations of fields.
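
As a rough illustration of that idea, here is a minimal Python sketch. It assumes each record is a dict with hypothetical `id`, `name`, `dob` and `postcode` keys (none of these names come from the question). Records are first grouped by initials plus date of birth, so only records in the same group are compared; postcodes are compared exactly after normalization, and names are compared with the standard library's `difflib.SequenceMatcher`. The 0.85 threshold is an arbitrary assumption that you would need to tune on your own data.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def normalize_postcode(postcode: str) -> str:
    """Remove whitespace and upper-case, so 'sw1a 1aa' equals 'SW1A1AA'."""
    return "".join(postcode.split()).upper()


def block_key(record: dict) -> tuple:
    """Hypothetical blocking key: name initials plus date of birth."""
    initials = "".join(part[0].upper() for part in record["name"].split())
    return (initials, record["dob"])


def find_matches(records: list[dict], threshold: float = 0.85) -> list[tuple]:
    """Return (id_a, id_b) pairs that look like the same person."""
    # Group records so that only candidates sharing a key are compared,
    # instead of comparing every record against every other record.
    blocks = defaultdict(list)
    for record in records:
        blocks[block_key(record)].append(record)

    matches = []
    for group in blocks.values():
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                # Structured field: exact comparison after normalization.
                same_postcode = (
                    normalize_postcode(a["postcode"])
                    == normalize_postcode(b["postcode"])
                )
                # Free-form field: fuzzy similarity between 0.0 and 1.0.
                name_score = SequenceMatcher(
                    None, a["name"].lower(), b["name"].lower()
                ).ratio()
                if same_postcode and name_score >= threshold:
                    matches.append((a["id"], b["id"]))
    return matches
```

Grouping first is what keeps the number of comparisons manageable at 100,000 records, because the expensive fuzzy comparison only runs inside each small group. The trade-off is that a typo in the date of birth or in an initial puts two records in different groups, so they are never compared.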

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow