
Python - getting just the difference between strings

What's the best way of getting just the difference between two multiline strings?

import difflib

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

diff = difflib.ndiff(a, b)
print(''.join(diff))

This produces:

  t  e  s  t  i  n  g     t  h  i  s     i  s     w  o  r  k  i  n  g     
     t  e  s  t  i  n  g     t  h  i  s     i  s     w  o  r  k  i  n  g     1     
+  + t+ e+ s+ t+ i+ n+ g+  + t+ h+ i+ s+  + i+ s+  + w+ o+ r+ k+ i+ n+ g+  + 2

What's the best way of getting exactly:

testing this is working 2?

Would regex be the solution here?

Rekovni asked Sep 27 '17 16:09


3 Answers

The easiest hack (credit to @Chris) is to use split().

Note: you need to determine which string is longer and call split() on that one. This only works when the shorter string occurs verbatim inside the longer one.

if len(a) > len(b):
    res = ''.join(a.split(b))          # a is longer: remove b from it
else:
    res = ''.join(b.split(a))          # b is longer: remove a from it

print(res.strip())                     #remove whitespace on either sides

# driver values

IN : a = 'testing this is working \n testing this is working 1 \n' 
IN : b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

OUT : testing this is working 2

EDIT: thanks to @ekhumoro for another hack using replace(), which avoids the join step entirely.

if len(a) > len(b):
    res = a.replace(b, '')             # a is longer: remove b from it
else:
    res = b.replace(a, '')             # b is longer: remove a from it
Kaushik NP answered Sep 25 '22 18:09
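
Both hacks above rely on the shorter string appearing verbatim inside the longer one. A minimal sketch that makes that assumption explicit follows; the function name leftover is hypothetical and not part of the original answer.

```python
def leftover(a, b):
    """Return what remains of the longer string once the shorter one is removed.

    Assumption: the shorter string occurs verbatim inside the longer one.
    A ValueError is raised otherwise, instead of silently returning the input.
    """
    longer, shorter = (a, b) if len(a) > len(b) else (b, a)
    if shorter not in longer:
        raise ValueError("shorter string is not contained in the longer one")
    # Remove only the first occurrence, then trim surrounding whitespace.
    return longer.replace(shorter, '', 1).strip()


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(leftover(a, b))  # testing this is working 2
```

If the two strings differ in the middle rather than at one end, this raises an error, which is usually a sign that a line-based comparison is the better fit.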

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

splitA = set(a.split("\n"))
splitB = set(b.split("\n"))

diff = splitB.difference(splitA)
diff = ", ".join(diff)  # ' testing this is working 2, more things if there were...'

Essentially, this turns each string into a set of lines and takes the set difference, i.e. all lines in B that are not in A. The result is then joined into one string.

Edit: This is a convoluted way of saying what @ShreyasG said: [x for x in B if x not in A]...

Godron629 answered Sep 24 '22 18:09
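
The list-comprehension form mentioned in the answer above can be sketched as follows. Unlike a pure set difference, it keeps the lines of b in their original order and keeps duplicates. The function name lines_only_in is hypothetical, and lines are compared exactly, including surrounding whitespace.

```python
def lines_only_in(b, a):
    """Return the lines of b that do not appear in a, in b's original order."""
    a_lines = set(a.split("\n"))  # a set gives fast membership tests
    # Compare the raw lines, then strip whitespace only for the returned values.
    return [line.strip() for line in b.split("\n") if line not in a_lines]


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(lines_only_in(b, a))  # ['testing this is working 2']
```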


This is basically @Godron629's answer, but since I can't comment, I'm posting it here with a slight modification: replacing difference with symmetric_difference so that the order of the two sets doesn't matter.

a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'

splitA = set(a.split("\n"))
splitB = set(b.split("\n"))

diff = splitB.symmetric_difference(splitA)
diff = ", ".join(diff)  # contains ' testing this is working 2' plus the empty string left by a's trailing newline (set order is arbitrary)
Keith answered Sep 26 '22 18:09
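
Since the question started from difflib, here is a sketch of the same two-directional comparison using difflib.ndiff on lists of lines. ndiff expects sequences of strings, so passing the raw strings (as in the question) compares them character by character. The function name changed_lines is hypothetical.

```python
import difflib


def changed_lines(a, b):
    """Return (removed, added): lines only in a and lines only in b, in order."""
    removed, added = [], []
    # splitlines() turns each multiline string into a list of lines.
    for line in difflib.ndiff(a.splitlines(), b.splitlines()):
        if line.startswith('- '):
            removed.append(line[2:].strip())
        elif line.startswith('+ '):
            added.append(line[2:].strip())
        # Lines starting with '  ' (unchanged) or '? ' (hints) are skipped.
    return removed, added


a = 'testing this is working \n testing this is working 1 \n'
b = 'testing this is working \n testing this is working 1 \n testing this is working 2'
print(changed_lines(a, b))  # ([], ['testing this is working 2'])
```

Swapping the arguments moves the extra line from the added list to the removed list, so neither string has to be designated as the longer one up front.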
