How to use SequenceMatcher to find similarity between two strings?

Tags: python, difflib

import difflib

a='abcd'
b='ab123'
seq=difflib.SequenceMatcher(a=a.lower(),b=b.lower())
seq=difflib.SequenceMatcher(a,b)
d=seq.ratio()*100
print d

I used the code above, but the output is 0.0. How can I get a valid result?

754

asked Jan 26 '11 07:01

joolie

2 Answers

You forgot the first parameter to SequenceMatcher.

>>> import difflib
>>> 
>>> a='abcd'
>>> b='ab123'
>>> seq=difflib.SequenceMatcher(None, a,b)
>>> d=seq.ratio()*100
>>> print d
44.4444444444
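To see where that number comes from, here is a minimal sketch that recomputes the ratio by hand. The difflib documentation defines ratio() as 2.0*M/T, where M is the number of matched elements and T is the combined length of both sequences. The variable names matched, total and manual_ratio are only illustrative.

```python
import difflib

a = 'abcd'
b = 'ab123'
seq = difflib.SequenceMatcher(None, a, b)

# get_matching_blocks() returns (a, b, size) triples; the final one is a
# zero-size sentinel, so summing the sizes gives the number of matches.
blocks = seq.get_matching_blocks()
matched = sum(block.size for block in blocks)  # M: only 'ab' is shared
total = len(a) + len(b)                        # T: 4 + 5

manual_ratio = 2.0 * matched / total
print(matched, total)
print(manual_ratio, seq.ratio())
```

Only the leading 'ab' is common to both strings, so M is 2 and T is 9, which gives roughly 0.444, or about 44.4 when multiplied by 100.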

http://docs.python.org/library/difflib.html

120

answered Oct 06 '22 06:10

Lennart Regebro

From the docs:

The SequenceMatcher class has this constructor:

class difflib.SequenceMatcher(isjunk=None, a='', b='', autojunk=True)

The problem in your code is that by doing

seq=difflib.SequenceMatcher(a,b)

you are passing a as the value for isjunk and b as the value for a, leaving b at its default of ''. Comparing 'ab123' with an empty string finds no matching characters, so the ratio is 0.0.
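The mis-binding can be checked directly by inspecting the matcher's a and b attributes. This is a small sketch; the name wrong is only illustrative.

```python
import difflib

a = 'abcd'
b = 'ab123'

# Positional call from the question: a is taken as isjunk, b is taken as a,
# and the b parameter keeps its default empty string.
wrong = difflib.SequenceMatcher(a, b)

print(repr(wrong.a))  # the first sequence actually being compared
print(repr(wrong.b))  # the second sequence actually being compared
print(wrong.ratio())
```

Here wrong.a holds 'ab123' and wrong.b is the empty string, so there is nothing to match against.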

One way to fix this (already mentioned by Lennart) is to explicitly pass None as the first argument, so that a and b are bound to the correct parameters.
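Passing the sequences by keyword should work as well, since isjunk then keeps its default of None. A minimal sketch, assuming a case-insensitive comparison is wanted as in the question; similarity_percent is a hypothetical helper name, not part of difflib.

```python
import difflib

def similarity_percent(first, second):
    """Hypothetical helper: case-insensitive similarity as a percentage."""
    # Keyword arguments avoid any confusion with the isjunk parameter.
    matcher = difflib.SequenceMatcher(a=first.lower(), b=second.lower())
    return matcher.ratio() * 100

print(similarity_percent('abcd', 'AB123'))
```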

However, I wanted to mention another solution that leaves the isjunk argument alone and instead uses the set_seqs() method to set the two sequences.

>>> import difflib
>>> a = 'abcd'
>>> b = 'ab123'
>>> seq = difflib.SequenceMatcher()
>>> seq.set_seqs(a.lower(), b.lower())
>>> d = seq.ratio()*100
>>> print(d)
44.44444444444444
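The setter methods are also handy when one string has to be compared against several others. The difflib documentation notes that SequenceMatcher caches details about the second sequence, so a reasonable pattern is to call set_seq2() once and set_seq1() for each candidate. The following is a sketch under that assumption; target, candidates, scores and best are illustrative names.

```python
import difflib

target = 'ab123'
candidates = ['abcd', 'ab123', 'xyz']

matcher = difflib.SequenceMatcher()
matcher.set_seq2(target)  # the second sequence is analysed once and cached

scores = {}
for candidate in candidates:
    matcher.set_seq1(candidate)  # only the first sequence changes per pass
    scores[candidate] = matcher.ratio()

# Pick the candidate with the highest similarity to the target.
best = max(scores, key=scores.get)
print(best, scores[best])
```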

answered Oct 06 '22 08:10

Tim
