
How to use Python's difflib to produce side-by-side comparison of two files similar to Unix sdiff command?

I am using Python 2.6 and I want to create a simple GUI with two side-by-side text panes comparing two text files (file1.txt and file2.txt).

I am using difflib, but it is not clear to me how to produce a result similar to the Unix sdiff command.

In order to reproduce a side-by-side comparison, I need difflib to return two variables file1_diff and file2_diff, for instance.

I have also considered using the sdiff output directly and parsing it to separate the panes, but that turned out not to be as easy as it seems. Any hints?

zml asked Jun 24 '15 09:06




1 Answer

You can use difflib.Differ to return a single sequence of lines with a marker at the start of each line which describes the line. The markers tell you the following information about the line:

Marker  Description
'- '    line unique to file 1
'+ '    line unique to file 2
'  '    line common to both files
'? '    line not present in either input file
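To see the markers in practice, here is a minimal sketch that runs Differ.compare on two small in-memory lists. The sample lines are made up for illustration and stand in for the contents of file1.txt and file2.txt. It also shows that the marked lines are enough to rebuild either input.

```python
import difflib

# Hypothetical sample data standing in for file1.txt and file2.txt
left = ["apple\n", "banana\n", "cherry\n"]
right = ["apple\n", "blueberry\n", "cherry\n", "date\n"]

result = list(difflib.Differ().compare(left, right))

# Every output line starts with a two-character marker
markers = [line[:2] for line in result]

# Lines marked '  ' or '- ' reproduce file 1; '  ' or '+ ' reproduce file 2
rebuilt_left = [line[2:] for line in result if line[:2] in ("  ", "- ")]
rebuilt_right = [line[2:] for line in result if line[:2] in ("  ", "+ ")]

for line in result:
    print(repr(line))
```

Because each side can be rebuilt from the common lines plus its own unique lines, a side-by-side view only has to decide where to put blank filler rows.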

You can use this information to decide how to display the data. For example, if the marker is a space, put the line in both the left and right widgets. If it is +, put a blank line on the left and the actual line on the right, showing that the line is unique to the text on the right. Likewise, - means the line is unique to the left.
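The same decisions can be made without any GUI, which gives the two variables the question asks for. The following is only a sketch: side_by_side is a hypothetical helper name, not part of difflib, and the sample input is made up.

```python
import difflib

def side_by_side(lines1, lines2):
    """Return two equal-length lists, one per pane, padded with '' so rows line up."""
    left, right = [], []
    for line in difflib.Differ().compare(lines1, lines2):
        marker, text = line[:2], line[2:].rstrip("\n")
        if marker == "  ":      # common line: show it on both sides
            left.append(text)
            right.append(text)
        elif marker == "- ":    # only in file 1: blank filler on the right
            left.append(text)
            right.append("")
        elif marker == "+ ":    # only in file 2: blank filler on the left
            left.append("")
            right.append(text)
        # '? ' hint lines belong to neither file and are skipped
    return left, right

# Made-up sample input
file1_diff, file2_diff = side_by_side(
    ["one\n", "two\n", "three\n"],
    ["one\n", "three\n", "four\n"],
)
for l, r in zip(file1_diff, file2_diff):
    print("%-10s | %s" % (l, r))
```

Note that a changed line shows up as a '-' row followed by a '+' row, so it occupies two rows here rather than one shared row as with sdiff's | marker. Pairing such rows is left out of this sketch.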

For example, you can create two text widgets t1 and t2, one for the left and one for the right. You can compare two files by creating a list of lines for each and then passing them to the compare method of the differ and then iterating over the results.

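import difflib
import Tkinter as tk  # Python 2; on Python 3 use: import tkinter as tk

t1 = tk.Text(...)
t2 = tk.Text(...)

# Read each file into a list of lines (line endings are kept)
with open("file1.txt", "r") as f:
    f1 = f.readlines()
with open("file2.txt", "r") as f:
    f2 = f.readlines()

differ = difflib.Differ()
for line in differ.compare(f1, f2):
    marker = line[0]   # first character of the two-character marker
    text = line[2:]    # the original line, without the marker
    if marker == " ":
        # line is the same in both files
        t1.insert("end", text)
        t2.insert("end", text)

    elif marker == "-":
        # line is only on the left; pad the right with a blank line
        t1.insert("end", text)
        t2.insert("end", "\n")

    elif marker == "+":
        # line is only on the right; pad the left with a blank line
        t1.insert("end", "\n")
        t2.insert("end", text)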
The above code ignores lines with the marker ? since those are extra lines that attempt to bring attention to the different characters on the previous line and aren't actually part of either file. You could use that information to highlight the individual characters if you wish.
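As a rough sketch of that last idea: the text after the "? " prefix lines up column by column with the line just before it, and uses ^, + and - to flag changed, added and deleted characters. The helper names hint_columns and changed_columns below are hypothetical, and the sample lines are made up.

```python
import difflib

def hint_columns(hint_line):
    """Return the column indexes flagged by a '? ' hint line."""
    guide = hint_line[2:].rstrip("\n")
    return [i for i, ch in enumerate(guide) if ch in "^+-"]

def changed_columns(lines1, lines2):
    """Return (marker, text, columns) for each '-'/'+' line; columns may be empty."""
    result = list(difflib.Differ().compare(lines1, lines2))
    rows = []
    for i, line in enumerate(result):
        if line[:2] in ("- ", "+ "):
            # A hint, when present, directly follows the line it describes
            nxt = result[i + 1] if i + 1 < len(result) else ""
            cols = hint_columns(nxt) if nxt.startswith("? ") else []
            rows.append((line[0], line[2:], cols))
    return rows

# Made-up sample input: one line with a small intraline change
rows = changed_columns(["the quick brown fox\n"], ["the quick brown cat\n"])
for marker, text, cols in rows:
    print(marker, repr(text), cols)
```

In a Tk text widget, these column indexes could then be turned into tag ranges on the corresponding row to color the changed characters.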

Bryan Oakley answered Sep 29 '22 11:09