"diff -u -B -w" in python?

Tags:

python

difflib

Using Python, I'd like to output the difference between two strings as a unified diff (-u) while, optionally, ignoring blank lines (-B) and spaces (-w).

Since the strings were generated internally, I'd prefer not to deal with the complexity of writing one or both strings to a file, running GNU diff, fixing up the output, and finally cleaning up.

While difflib.unified_diff generates unified diffs, it doesn't seem to let me tweak how spaces and blank lines are handled. I've looked at its implementation and I suspect the only solution is to copy and hack that function's body.

Is there anything better?

For the moment I'm stripping the pad characters using something like:

import difflib
import re
import sys

l = "line 1\nline 2\nline 3\n"
r = "\nline 1\n\nline 2\nline3\n"
strip_spaces = True
strip_blank_lines = True

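# -w workaround: delete every run of spaces and tabs
if strip_spaces:
    l = re.sub(r"[ \t]+", r"", l)
    r = re.sub(r"[ \t]+", r"", r)
# -B workaround: collapse runs of newlines, then drop a leading newline
if strip_blank_lines:
    l = re.sub(r"^\n", r"", re.sub(r"\n+", r"\n", l))
    r = re.sub(r"^\n", r"", re.sub(r"\n+", r"\n", r))
# run diff on the stripped text, not on the original input
diff = difflib.unified_diff(l.splitlines(keepends=True), r.splitlines(keepends=True))
sys.stdout.writelines(diff)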

which, of course, produces a diff of something other than the original input. For instance, pass the above text to GNU diff 3.3 run as "diff -u -w" and "line 3" is displayed as part of the context, whereas the code above displays "line3".

cagney asked Jul 31 '15 18:07


1 Answer

Make your own SequenceMatcher, copy the body of unified_diff, and replace its SequenceMatcher with your own matcher.
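One way to read this suggestion, as a minimal sketch rather than a drop-in replacement for GNU diff: instead of subclassing SequenceMatcher, feed a stock SequenceMatcher whitespace-stripped copies of the lines and print the original lines for each opcode. The function name unified_diff_ignoring, its parameters, and the helper _range are hypothetical names made up for this example. The "---"/"+++" file header lines are omitted, and the blank-line handling is only an approximation of -B: blank lines are dropped before matching, so they never appear in the output, not even as context.

```python
import difflib
import re


def _range(start, stop):
    # Format one side of a hunk header using the usual unified-diff
    # convention: "start" alone for one line, "start,length" otherwise.
    begin = start + 1
    length = stop - start
    if length == 1:
        return str(begin)
    if not length:
        begin -= 1
    return "{},{}".format(begin, length)


def unified_diff_ignoring(a, b, n=3, ignore_space=False, ignore_blank_lines=False):
    """Yield unified-diff hunks for the line lists a and b.

    Lines are compared by a normalized key but emitted in original form.
    """
    if ignore_blank_lines:
        # Approximation of -B (an assumption of this sketch): drop blank
        # lines up front, so hunk line numbers refer to the filtered lists.
        a = [line for line in a if line.strip()]
        b = [line for line in b if line.strip()]

    def key(line):
        # Approximation of -w: compare with all whitespace removed.
        return re.sub(r"\s+", "", line) if ignore_space else line

    matcher = difflib.SequenceMatcher(None, [key(x) for x in a], [key(x) for x in b])
    for group in matcher.get_grouped_opcodes(n):
        first, last = group[0], group[-1]
        yield "@@ -{} +{} @@\n".format(
            _range(first[1], last[2]), _range(first[3], last[4])
        )
        for tag, i1, i2, j1, j2 in group:
            if tag == "equal":
                # Context comes from the first input, untouched.
                for line in a[i1:i2]:
                    yield " " + line
                continue
            if tag in ("replace", "delete"):
                for line in a[i1:i2]:
                    yield "-" + line
            if tag in ("replace", "insert"):
                for line in b[j1:j2]:
                    yield "+" + line
```

With the strings from the question, passing ignore_space=True keeps "line 3" (with its space) as context, because context lines are taken from the first input rather than from the stripped keys. Passing both flags yields no hunks at all for that input.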

Tomasz Jakub Rup answered Sep 19 '22 05:09