Question

My MySQL table has a city_name column, and some of the values in it are misspelled. Each misspelled city name needs to be corrected. Another table has a column that contains all the correct city names, and the misspellings in the first table should be corrected based on that table. I looked into things like SOUNDEX but couldn't find an example of anyone doing something similar.

TableA

+----+------+-----------+
| id | col1 | city_name |
+----+------+-----------+

TableB

+-------+-----------+
| index | City_name |
+-------+-----------+

The approach I have in mind is to write an SQL query that creates a separate table for each group of rows with similar-sounding city names.

Once this is done, I would replace the wrong spellings with the correct ones, again using an SQL query,

and finally combine all the separate tables into one table with all spellings corrected.

I am looking for advice on both the approach and the MySQL query syntax.


Solution

There is going to be some manual work involved, and building a front end for it may not be worth the trouble if this is a one-time thing.

What I would do is the following:

  1. Generate a list of all misspellings.
  2. Generate suggestions based on SOUNDEX.
  3. Go over the list by hand, pick the right suggestion for each misspelling, and run an UPDATE statement per fix.
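Step 1 on its own could be sketched roughly like this. It assumes the table and column names from the question (TableA.city_name, TableB.City_name) and a case-insensitive collation, so that names differing only in letter case count as a match. The occurrences alias is only illustrative.

```sql
-- Distinct city names in TableA that have no exact match in TableB,
-- most frequent first, so the most common misspellings get fixed first.
SELECT a.city_name, COUNT(*) AS occurrences
FROM TableA AS a
LEFT JOIN TableB AS b
  ON b.City_name = a.city_name
WHERE b.City_name IS NULL      -- no exact match found in TableB
GROUP BY a.city_name
ORDER BY occurrences DESC;
```

The length of this list gives a feel for how much manual work is ahead before you look at any suggestions.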

So, how to do this:

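SELECT orig.id, orig.city_name AS misspelled, correct.city_name AS suggestion
FROM TableA AS orig
LEFT OUTER JOIN TableB AS correct
  ON SOUNDEX(orig.city_name) = SOUNDEX(correct.city_name)
-- keep only the rows whose city name has no exact match in TableB
WHERE orig.city_name NOT IN (SELECT city_name FROM TableB);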

Then write the UPDATE statements by hand. For some rows SOUNDEX will produce no suggestion at all (the values coming from TableB will be NULL), and for others it will produce several candidates; you will have to resolve those cases yourself. Computers just aren't that smart.
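A hand-written fix might look like the sketch below. The city names 'Bangalor' and 'Bangalore' are made-up example values, and the city_fix table in the second half is a hypothetical helper that is not part of the question's schema. It is only one possible way to batch the fixes once you have decided on them.

```sql
-- One fix at a time: replace a single misspelling with the chosen spelling.
UPDATE TableA
SET city_name = 'Bangalore'
WHERE city_name = 'Bangalor';

-- Alternative: collect the decisions in a (hypothetical) mapping table
-- and apply them all in one statement.
CREATE TABLE city_fix (
  wrong_name VARCHAR(255) NOT NULL PRIMARY KEY,
  right_name VARCHAR(255) NOT NULL
);

INSERT INTO city_fix (wrong_name, right_name)
VALUES ('Bangalor', 'Bangalore');

-- MySQL multi-table UPDATE: rewrite every row that has a recorded fix.
UPDATE TableA AS a
JOIN city_fix AS f
  ON f.wrong_name = a.city_name
SET a.city_name = f.right_name;
```

The mapping table also leaves a record of what was changed, which helps if a correction later turns out to be wrong.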

Licensed under: CC-BY-SA with attribution