
Generate pretty diff html in Python

I have two chunks of text that I would like to compare and see which words/lines have been added/removed/modified in Python (similar to a Wiki's Diff Output).

I have tried difflib.HtmlDiff, but its output is less than pretty.

Is there a way in Python (or external library) that would generate clean looking HTML of the diff of two sets of text chunks? (not just line level, but also word/character modifications within a line)

The Unknown asked Oct 16 '09 06:10


3 Answers

There's diff_prettyHtml() in the diff-match-patch library from Google.
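
To give a feel for inline, word-level markup of this kind without installing anything, here is a minimal stdlib-only sketch. It is not diff-match-patch's implementation: it swaps in difflib.SequenceMatcher over whitespace-split words, and the function name inline_word_diff is made up for illustration.

```python
import difflib
import html


def inline_word_diff(old, new):
    """Return an HTML fragment marking word-level changes with <del>/<ins>.

    Hypothetical helper: words are compared after splitting on whitespace,
    so the original spacing is not preserved.
    """
    old_words = old.split()
    new_words = new.split()
    matcher = difflib.SequenceMatcher(a=old_words, b=new_words, autojunk=False)
    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # Escape the text so that "<" or "&" in the input cannot break the markup
        removed = html.escape(" ".join(old_words[i1:i2]))
        added = html.escape(" ".join(new_words[j1:j2]))
        if tag == "equal":
            parts.append(removed)
        else:
            # "replace" produces both a deletion and an insertion
            if removed:
                parts.append("<del>" + removed + "</del>")
            if added:
                parts.append("<ins>" + added + "</ins>")
    return " ".join(parts)


print(inline_word_diff("the quick brown fox", "the slow brown fox jumps"))
# → the <del>quick</del> <ins>slow</ins> brown fox <ins>jumps</ins>
```

Styling the ins and del elements with CSS (background colours, strike-through) then gives a wiki-like inline diff. A dedicated library will usually produce cleaner, character-level results than this word-level approximation.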

tonfa answered Oct 23 '22 16:10


Generally, if you want some HTML to render in a prettier way, you do it by adding CSS.

For instance, if you generate the HTML like this:

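import difflib
import sys

fromfile = "xxx"
tofile = "zzz"

# Read both files as lists of lines. The old 'U' mode was removed in
# Python 3.11; universal newlines are the default in text mode.
with open(fromfile) as f:
    fromlines = f.readlines()
with open(tofile) as f:
    tolines = f.readlines()

# make_file() returns a complete HTML document as a single string
diff = difflib.HtmlDiff().make_file(fromlines, tolines, fromfile, tofile)

sys.stdout.write(diff)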

then you get green backgrounds on added lines, yellow on changed lines, and red on deleted lines. If I were doing this, I would take the generated HTML, extract the body, and prefix it with my own handwritten block of HTML with lots of CSS to make it look good. I'd also probably strip out the legend table and either move it to the top or put it in a div so that CSS can position it.
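
One way to sketch that idea: HtmlDiff.make_table() returns only the diff table (no surrounding page and no legend), so it can be dropped into a hand-written page without extracting the body first. The stylesheet and the function name styled_diff_page below are assumptions for illustration; the class names diff, diff_header, diff_add, diff_chg and diff_sub are the ones difflib emits.

```python
import difflib

# Hypothetical stylesheet. It overrides nothing in difflib itself; it simply
# targets the class names that HtmlDiff puts on its table, cells and spans.
CUSTOM_CSS = """
table.diff { font-family: monospace; border-collapse: collapse; }
.diff_header { text-align: right; color: #888; padding: 0 6px; }
.diff_add { background-color: #d4f8d4; }
.diff_chg { background-color: #fff3b0; }
.diff_sub { background-color: #f8d4d4; }
"""


def styled_diff_page(fromlines, tolines, fromdesc="", todesc=""):
    """Wrap difflib's diff table in a page that uses our own CSS."""
    table = difflib.HtmlDiff().make_table(fromlines, tolines, fromdesc, todesc)
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">\n'
        "<style>" + CUSTOM_CSS + "</style></head>\n"
        "<body>\n" + table + "\n</body></html>\n"
    )


page = styled_diff_page(["one\n", "two\n"], ["one\n", "three\n"], "old", "new")
```

Writing page to a file and opening it in a browser shows the same side-by-side table as make_file(), but with the custom colours and fonts.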

Actually, I would give serious consideration to just fixing up the difflib module (which is written in Python) to generate better HTML and contributing the change back to the project. If you have a CSS expert to help you, or are one yourself, please consider doing this.

Michael Dillon answered Oct 23 '22 14:10


I recently posted a Python script that does just this: diff2HtmlCompare (follow the link for a screenshot). Under the hood it wraps difflib and uses Pygments for syntax highlighting.

wagoodman answered Oct 23 '22 15:10