I am trying to show where two HTML pages differ. Is there a way to compare the HTML source code of two nearly identical webpages and highlight the differences visually in a UI?
What I tried: I thought of taking a snapshot of each page and then using Resemble.js to compare the two images. But that flags even very minute differences, and the results are not clear.
I then thought of comparing the DOM structure or the source code and showing where the two pages actually differ in the UI.
Is there any way I could achieve this? I am using Selenium WebDriver to get the snapshots and the HTML source code.
EDIT:
I guess my question was not clear. I want to find the differences in HTML content between webpages in order to detect A/B tests that are currently running. I first saved the HTML source to a text file and then compared it with a previously captured HTML source using java-diff-utils. This gave me the actual lines that differ between the two text files.
Now the problem is: how can I show this difference in the UI by highlighting the areas that I found to be different? I hope this makes it clearer.
The code below shows the lines that differ:
List<String> original = fileToLines("HTML Source diff/originalSource.txt");
List<String> revised = fileToLines("HTML Source diff/sourceAfterCookieClear.txt");
// Compute diff. Get the Patch object. Patch is the container for computed deltas.
Patch patch = DiffUtils.diff(original, revised);
System.out.println("Printing Deltas\n");
// Collect the changed lines from every delta so that one delta does not overwrite another.
StringBuilder content = new StringBuilder();
for (Delta delta : patch.getDeltas()) {
    // Read the lines from the revised chunk directly instead of parsing its toString() output.
    for (Object line : delta.getRevised().getLines()) {
        content.append(line).append('\n');
    }
}
writeTextToFile(content.toString(), "difference.html");
Any leads in the form of code would be helpful.
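One possible way to approach the highlighting step described above, sketched in Python with only the standard library rather than java-diff-utils: diff the two lists of lines and emit the revised source as escaped HTML, wrapping changed lines in <mark> and removed lines in <del>. The function name highlight_revised is hypothetical, and the sketch assumes the lines have already been stripped of trailing newlines. It highlights the page source, not the rendered page.

```python
import difflib
import html


def highlight_revised(original_lines, revised_lines):
    """Render the revised source as HTML with line-level highlights.

    Hypothetical helper: lines only in the revised list are wrapped in
    <mark>, lines only in the original list are wrapped in <del>.
    """
    matcher = difflib.SequenceMatcher(
        a=original_lines, b=revised_lines, autojunk=False
    )
    out = ["<pre>"]
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged lines are shown as plain escaped text.
            out.extend(html.escape(line) for line in revised_lines[j1:j2])
            continue
        # For "replace" and "delete" the original slice is non-empty.
        out.extend(
            "<del>%s</del>" % html.escape(line) for line in original_lines[i1:i2]
        )
        # For "replace" and "insert" the revised slice is non-empty.
        out.extend(
            "<mark>%s</mark>" % html.escape(line) for line in revised_lines[j1:j2]
        )
    out.append("</pre>")
    return "\n".join(out)
```

Writing the returned string to a file such as difference.html and opening it in a browser would show the revised source with the differing lines highlighted.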
To compare HTML files and verify how our Java library works, simply load the files you want to diff and select the export file format. After the two files are compared, the document containing the differences is loaded automatically.
If you want to compare files one by one manually, use the diff command to display the line-by-line differences between two files. You can use the --changed-group-format and --unchanged-group-format options to filter the required data.
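If calling the diff command is not convenient, a similar "changed groups only" filter can be sketched in Python. This is not the diff tool itself; it uses difflib.SequenceMatcher from the standard library and simply skips unchanged groups. The function name changed_lines is hypothetical.

```python
import difflib


def changed_lines(old_lines, new_lines):
    """Return only the differing lines as (marker, line) pairs.

    The marker is "-" for a line present only in old_lines and "+" for a
    line present only in new_lines. Unchanged groups are skipped.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # drop unchanged groups entirely
        changes.extend(("-", line) for line in old_lines[i1:i2])
        changes.extend(("+", line) for line in new_lines[j1:j2])
    return changes
```

Each returned pair can then be printed or written to a file, depending on what the next step needs.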
Use Python's difflib. For example:
import difflib

# Read both files as lists of lines; the with block closes them afterwards.
with open('file1.html', 'r') as f1, open('file2.html', 'r') as f2:
    file1 = f1.readlines()
    file2 = f2.readlines()

htmlDiffer = difflib.HtmlDiff()
htmldiffs = htmlDiffer.make_file(file1, file2)

with open('comparison.html', 'w') as outfile:
    outfile.write(htmldiffs)
This will create an HTML file named comparison.html containing the diffs between the two HTML files file1.html and file2.html. Here file1.html is considered the source or original version, whichever is more appropriate for your case, and file2.html is the changed or new version, again whichever is more appropriate here.
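For pages that are almost identical, the full side-by-side table can get long. As a sketch, make_file also accepts context=True and numlines to show only the changed regions with a few surrounding lines, plus fromdesc and todesc for column titles. The function name build_report and the titles "Original" and "Revised" are just illustrative choices.

```python
import difflib


def build_report(old_lines, new_lines, context_lines=2):
    """Return an HTML diff page that shows only changed regions.

    context_lines is the number of unchanged lines kept around each change.
    """
    differ = difflib.HtmlDiff(wrapcolumn=80)  # wrap long HTML source lines
    return differ.make_file(
        old_lines,
        new_lines,
        fromdesc="Original",
        todesc="Revised",
        context=True,
        numlines=context_lines,
    )
```

The returned string can be written to comparison.html in the same way as in the example above.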
Hope that helps!
Use the DaisyDiff API: http://code.google.com/p/daisydiff/. You can call it from the command line after your Java code has detected a difference.