
How to use Python to diff two HTML files

Tags:

python

html

diff

I want to use Python to diff two HTML files.

Example:

html_1 = """
<p>i love it</p>
"""
html_2 = """ 
<h2>i love it </p>
"""

The diff output should look like this:

diff_html = """
<del><p>i love it</p></del><ins><h2>i love it</h2></ins>
"""

Is there a Python library that can help me do this?

mike asked Mar 05 '12 05:03



1 Answer

You could use difflib.ndiff() to compare the two inputs line by line, then look for the "-"/"+" markers at the start of each output line and replace them with your desired HTML.

import difflib

html_1 = """
<p>i love it</p>
"""
html_2 = """
<h2>i love it </p>
"""

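diff_html = ""
# ndiff() prefixes each line with "- " (only in html_1), "+ " (only in html_2),
# "  " (in both) or "? " (intraline hints); only the first two are used here.
theDiffs = difflib.ndiff(html_1.splitlines(), html_2.splitlines())
for eachDiff in theDiffs:
    if eachDiff[0] == "-":
        diff_html += "<del>%s</del>" % eachDiff[1:].strip()
    elif eachDiff[0] == "+":
        diff_html += "<ins>%s</ins>" % eachDiff[1:].strip()

print(diff_html)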

The result:

<del><p>i love it</p></del><ins><h2>i love it </p></ins>
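
If you also want the unchanged lines kept in the output, the same ndiff() idea can be wrapped in a function. This is only a minimal sketch: the helper name html_line_diff is hypothetical, and it assumes that stripping each line and joining the pieces without separators is acceptable for your HTML. It still works line by line, so a one-character change marks the whole line as deleted and re-inserted.

```python
import difflib


def html_line_diff(old_html, new_html):
    """Hypothetical helper: mark removed lines with <del>, added lines with <ins>."""
    parts = []
    for line in difflib.ndiff(old_html.splitlines(), new_html.splitlines()):
        # Every ndiff line starts with a two-character marker.
        marker, text = line[:2], line[2:].strip()
        if marker == "- ":        # line only in old_html
            parts.append("<del>%s</del>" % text)
        elif marker == "+ ":      # line only in new_html
            parts.append("<ins>%s</ins>" % text)
        elif marker == "  " and text:  # unchanged, non-empty line
            parts.append(text)
        # "? " lines are ndiff's intraline hints and are skipped.
    return "".join(parts)


print(html_line_diff("<div>\n<p>a</p>\n</div>", "<div>\n<p>b</p>\n</div>"))
```

Checking the full two-character marker (rather than only the first character) keeps the "? " hint lines and the unchanged lines clearly separated from real additions and deletions.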
Nate answered Oct 11 '22 03:10
