 

Python difflib, get offsets

Tags: python

Is there a way in Python, using difflib, to get the offsets of the changes as well as the changes themselves?

What I have is the following:

import difflib

text1 = 'this is a sample text'.split()
text2 = 'this is text two.'.split()

print(list(difflib.ndiff(text1, text2)))

which prints:

['  this', '  is', '- a', '- sample', '  text', '+ two.']

Can I also get the offsets of the corresponding changes? A naive solution would be to search the original text for each changed token, but that breaks down once the strings get longer and contain repeated terms.

CentAu asked Jun 25 '26 19:06



1 Answer

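SequenceMatcher.get_matching_blocks() might help. It returns a list of (i, j, n) triples, each meaning that a[i:i+n] == b[j:j+n]; the final triple is a dummy with n == 0. Whatever falls between consecutive matching blocks is a difference, so these indices give you the locations of the changes.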

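>>> from difflib import SequenceMatcher
>>> s = SequenceMatcher(lambda x: x == " ",
...                     "private Thread currentThread;",
...                     "private volatile Thread currentThread;")
>>> for block in s.get_matching_blocks():
...     print("a[%d] and b[%d] match for %d elements" % block)
a[0] and b[0] match for 8 elements
a[8] and b[17] match for 21 elements
a[29] and b[38] match for 0 elements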

https://docs.python.org/2/library/difflib.html#difflib.SequenceMatcher.get_matching_blocks
https://docs.python.org/2/library/difflib.html#sequencematcher-examples
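
Applied to the word lists from the question, a related method, SequenceMatcher.get_opcodes(), may be more direct: it yields (tag, i1, i2, j1, j2) tuples that already carry the offsets of each change. Here is a minimal sketch, assuming "offsets" means positions in the token lists rather than character positions; the helper name diff_with_offsets is made up for illustration and is not part of difflib.

```python
import difflib


def diff_with_offsets(a, b):
    """Collect every non-equal opcode together with the items it covers.

    Each entry is (tag, a_start, a_end, b_start, b_end, a_items, b_items),
    where the start/end pairs are slice offsets into a and b.
    """
    matcher = difflib.SequenceMatcher(None, a, b)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == 'equal':
            continue  # unchanged run, nothing to report
        changes.append((tag, i1, i2, j1, j2, a[i1:i2], b[j1:j2]))
    return changes


text1 = 'this is a sample text'.split()
text2 = 'this is text two.'.split()

for change in diff_with_offsets(text1, text2):
    print(change)
```

For the two sentences in the question this should print ('delete', 2, 4, 2, 2, ['a', 'sample'], []) and ('insert', 5, 5, 3, 4, [], ['two.']), so "a sample" is removed from positions 2 to 4 of text1 and "two." is inserted at position 3 of text2. Because the offsets come from the matcher itself, repeated terms do not cause ambiguity.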

matthewatabet answered Jun 27 '26 08:06



