Spell checking city names?

I figure this problem is easier than a regular spell checker, since the list of U.S. cities is small compared to all known English words.

Anyhow, here's the problem: I have text files full of city names, some of which are spelled correctly and some of which aren't.

What kind of algorithm can I use to correct all the misspellings of city names?

Esteban Araya asked Nov 05 '08 03:11



1 Answer

Do you actually need to correct the misspellings, or just flag them as a normal spell checker would? If the latter, you only need a list of correct spellings and a check that each name matches an entry in that list.
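For the flag-only case, a minimal sketch might look like the following. The `KNOWN_CITIES` set and the `normalize` and `flag_misspellings` helpers are made up for illustration; a real reference list would have to come from whatever source of city names you trust.

```python
# Hypothetical reference list; a real one would be loaded from a file.
KNOWN_CITIES = {"Chicago", "Houston", "Phoenix", "San Antonio"}


def normalize(name: str) -> str:
    """Collapse whitespace and case so trivial differences are not flagged."""
    return " ".join(name.split()).casefold()


# Normalize the reference list once so each lookup is a cheap set membership test.
_KNOWN_NORMALIZED = {normalize(city) for city in KNOWN_CITIES}


def flag_misspellings(names):
    """Return the names that do not match any entry in the reference list."""
    return [n for n in names if normalize(n) not in _KNOWN_NORMALIZED]


if __name__ == "__main__":
    print(flag_misspellings(["Chicago", "Huston", "san  antonio"]))
```

Whether to ignore case and extra whitespace is a judgment call; drop the `normalize` step if you want exact matches only.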

If you want to actually correct them, you probably want to use edit distance to measure how close each misspelled string is to the names in your reference list. Then you can replace the misspelled word with the closest match. You may also want to handle the possibility that the intended city is not in your list.

The Levenshtein distance Wikipedia article is another good resource.
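As a rough sketch of the correction approach, here is a plain Levenshtein distance plus a closest-match lookup. The `correct_city` helper and its `max_distance` cutoff of 2 are assumptions for illustration, not a tuned value; the cutoff is one simple way to handle a city that isn't in your list at all.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def correct_city(name, known_cities, max_distance=2):
    """Return the closest known city, or None if nothing is within max_distance.

    max_distance is an assumed cutoff; ties go to the first city encountered.
    """
    best, best_dist = None, max_distance + 1
    for city in known_cities:
        d = levenshtein(name.lower(), city.lower())
        if d < best_dist:
            best, best_dist = city, d
    return best


if __name__ == "__main__":
    cities = ["Chicago", "Houston", "Phoenix"]
    print(correct_city("Huston", cities))
    print(correct_city("Springfield", cities))
```

This compares every input against every city, which should be fine for a list the size of U.S. cities. Returning `None` leaves it to you to decide whether an unmatched name is a bad misspelling or a city missing from the list.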

108 answered Oct 23 '22 03:10
