Question

Let's say I have a table objects with the fields id, name, and misc.

How can I find rows with similar or duplicate name values? MySQL on its own can find exact duplicates, but not similar values, e.g. PHP Hypertext Preprocessor and PHP Hypertext Postprocessor (~90% match with the source value).

Can this be done with Sphinx? If so, how?


Solution

I don't know the details of Sphinx, but what you're describing sounds like calculating the Levenshtein distance between names. A quick search for "sphinx php levenshtein" turned up this thread, which describes a method that might work for you. Hopefully that gives you something to go on.
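To make the idea concrete, here is a rough sketch in plain Python (not Sphinx) of scoring name similarity with Levenshtein distance. The function names, the 0.85 threshold, and the sample rows are assumptions made up for illustration. Similarity is taken as 1 minus the distance divided by the longer string's length, which is only one of several possible definitions.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insert, delete, substitute) between two strings."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed definition: 1 - distance / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def similar_pairs(rows, threshold=0.85):
    """Return (id, id) pairs whose names meet the threshold.

    rows is a list of (id, name) tuples, e.g. fetched from the objects table.
    The threshold value is an arbitrary choice for this example.
    """
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(rows, 2):
        if similarity(name_a.lower(), name_b.lower()) >= threshold:
            pairs.append((id_a, id_b))
    return pairs


# Hypothetical sample data standing in for rows of the objects table.
rows = [
    (1, "PHP Hypertext Preprocessor"),
    (2, "PHP Hypertext Postprocessor"),
    (3, "Sphinx Search"),
]
print(similar_pairs(rows))
```

This compares every pair of rows, so the cost grows quadratically with the table size. For a large table you would probably want to narrow down candidates first (for instance with a full-text index) and only run the distance check on those.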

OTHER TIPS

The 'suggest' example that ships with Sphinx might be a useful starting point.

http://code.google.com/p/sphinxsearch/source/browse/trunk/#trunk%2Fmisc%2Fsuggest

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow