Python: How to correct misspelled names

Tags: python, spell-checking

I have a list of city names, some of which are misspelled:

['bercelona', 'emstrdam', 'Praga']

And a list of all possible city names, correctly spelled:

['New York', 'Amsterdam', 'Barcelona', 'Berlin', 'Prague']

I'm looking for an algorithm that finds the closest match in the second list for each name in the first list, and returns the first list with the names correctly spelled. So it should return the following list:

['Barcelona', 'Amsterdam', 'Prague']

267

asked Dec 16 '16 21:12

ebeneditos

1 Answer

You can use the Ratcliff/Obershelp algorithm, which is built into the standard library as difflib.SequenceMatcher:

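import difflib


def is_similar(first, second, ratio):
    # True when the similarity ratio (0.0 to 1.0) exceeds the threshold
    return difflib.SequenceMatcher(None, first, second).ratio() > ratio


first = ['bercelona', 'emstrdam', 'Praga']
second = ['New York', 'Amsterdam', 'Barcelona', 'Berlin', 'Prague']

# Keep every candidate from `second` that is similar enough to a name in `first`
result = [s for f in first for s in second if is_similar(f, s, 0.7)]
print(result)
# ['Barcelona', 'Amsterdam', 'Prague']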

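Here 0.7 is the similarity threshold. Run some tests on your own data and adjust this value as needed. The ratio measures how similar the two strings are: 1 means they are identical, 0 means they have nothing in common. Note that the comparison is case-sensitive, and that the list comprehension keeps every candidate above the threshold, so a misspelled name can produce several matches or none at all.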
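If you want exactly one corrected name per input rather than every candidate above a threshold, one option is difflib.get_close_matches, which ranks candidates with the same SequenceMatcher ratio. The following is only a minimal sketch, not part of the original answer: the helper name correct_names, the lower-casing step, and the choice to keep a name unchanged when nothing matches are all assumptions made for illustration, and the cutoff of 0.6 is simply the library default.

```python
import difflib


def correct_names(names, valid_names, cutoff=0.6):
    """Return `names` with each entry replaced by its closest valid name.

    Hypothetical helper: names without a match above `cutoff` are kept as-is.
    """
    # Compare in lower case, then map back to the canonical spelling
    canonical = {name.lower(): name for name in valid_names}
    candidates = list(canonical)

    corrected = []
    for name in names:
        # n=1 asks for only the single best match at or above the cutoff
        matches = difflib.get_close_matches(name.lower(), candidates, n=1, cutoff=cutoff)
        corrected.append(canonical[matches[0]] if matches else name)
    return corrected


first = ['bercelona', 'emstrdam', 'Praga']
second = ['New York', 'Amsterdam', 'Barcelona', 'Berlin', 'Prague']

print(correct_names(first, second))
# ['Barcelona', 'Amsterdam', 'Prague']
```

Because each input yields exactly one output, the result always has the same length and order as the first list, which the threshold-only comprehension does not guarantee.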

102

answered Oct 19 '22 15:10

pivanchy
