 

Getting the line numbers of lines that were changed

Tags: python

Given two text files A and B, what is an easy way to get the line numbers of the lines in B that are not present in A? I see there's difflib, but I don't see an interface for retrieving line numbers.

Yaroslav Bulatov asked Feb 29 '12 20:02



2 Answers

difflib can give you what you need. Assume:

a.txt

this 
is 
a 
bunch 
of 
lines

b.txt

this 
is 
a 
different
bunch 
of 
other
lines

Code like this:

import difflib

fileA = open("a.txt", "rt").readlines()
fileB = open("b.txt", "rt").readlines()

d = difflib.Differ()
diffs = d.compare(fileA, fileB)
lineNum = 0

for line in diffs:
    # split off the two-character difflib code
    code = line[:2]
    # if the line is in both files or only in b, advance b's line number
    if code in ("  ", "+ "):
        lineNum += 1
    # if this line is only in b, print its line number and text
    if code == "+ ":
        print("%d: %s" % (lineNum, line[2:].strip()))

gives output like:

bgporter@varese ~/temp:python diffy.py 
4: different
7: other

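You'll also want to look at the difflib code "? " and decide how to handle it. Those lines are hints that mark intraline differences and are not present in either file, so the loop above skips them without counting them.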

(Also, in real code you'd want to use context managers to make sure the files get closed.)
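
If parsing the two-character codes feels fragile, one possible alternative is difflib.SequenceMatcher, whose get_opcodes() method reports index ranges directly. The sketch below is only an illustration: the helper name added_line_numbers is hypothetical, the sample lists stand in for the contents of a.txt and b.txt, and treating both "insert" and "replace" ranges as "lines only in B" is an assumption you may want to adjust.

```python
import difflib


def added_line_numbers(lines_a, lines_b):
    """Return 1-based line numbers of lines_b that difflib marks as inserted or replaced.

    Hypothetical helper for illustration; not part of difflib.
    """
    matcher = difflib.SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    numbers = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # j1:j2 is the affected slice of lines_b (0-based, end-exclusive)
        if tag in ("insert", "replace"):
            numbers.extend(range(j1 + 1, j2 + 1))
    return numbers


# Sample data standing in for a.txt and b.txt
a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]

print(added_line_numbers(a, b))  # [4, 7]
```

Since SequenceMatcher never emits "? " hint lines, there is nothing extra to filter out here.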

bgporter answered Sep 30 '22 13:09



A poor man's solution:

with open('A.txt') as f:
    linesA = f.readlines()

with open('B.txt') as f:
    linesB = f.readlines()

print([k for k, v in enumerate(linesB) if v not in linesA])
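
Note that the list comprehension above yields 0-based indices and scans linesA once for every line of B. A minimal variant, assuming 1-based line numbers are wanted and that a set lookup is acceptable, might look like the following; the function name lines_only_in_b and the sample lists are hypothetical.

```python
def lines_only_in_b(lines_a, lines_b):
    """Return 1-based line numbers of lines in lines_b whose text never occurs in lines_a.

    Hypothetical helper for illustration.
    """
    seen = set(lines_a)  # constant-time membership tests
    return [n for n, line in enumerate(lines_b, start=1) if line not in seen]


# Sample data standing in for A.txt and B.txt
lines_a = ["this\n", "is\n", "a\n", "bunch\n", "of\n", "lines\n"]
lines_b = ["this\n", "is\n", "a\n", "different\n", "bunch\n", "of\n", "other\n", "lines\n"]

print(lines_only_in_b(lines_a, lines_b))  # [4, 7]
```

Like the original one-liner, this ignores position and duplicates: a line of B that appears anywhere in A is never reported, even if it was moved or repeated.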

Prashant Kumar answered Sep 30 '22 13:09
