
Diff algorithms

Can somebody recommend papers (literature) or code snippets on tree-based diff algorithms for XML, i.e. algorithms that work on the DOM tree?

Thank you very much.

machinery asked Sep 21 '12 15:09




1 Answer

Here is one useful paper on the subject (X-Diff, a change detection algorithm for XML documents): http://pdf.aminer.org/000/301/327/x_diff_an_effective_change_detection_algorithm_for_xml_documents.pdf

Here is one tool you can experiment with: http://www.cs.hut.fi/~ctl/3dm/

The Java source for that tool is also available and may be of great use.
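Since the question also asks for code snippets, here is a minimal sketch of the general idea, assuming a deliberately naive strategy: walk both trees in parallel and compare children by position. This is not the X-Diff algorithm from the paper and not how 3DM works; the function name `diff_elements` and the tuple format are made up for illustration, and it only uses Python's standard `xml.etree.ElementTree`.

```python
import xml.etree.ElementTree as ET


def diff_elements(old, new, path="", index=0):
    """Naive ordered tree diff: returns a list of (operation, path, detail).

    Children are matched purely by position, so a single insertion in the
    middle of a list shows up as a run of changes. Real tree-diff algorithms
    do proper node matching instead.
    """
    # The index is the position among all siblings (not XPath semantics).
    here = f"{path}/{old.tag}[{index}]"
    if old.tag != new.tag:
        # Different element type: report the whole subtree as replaced.
        return [("replace", here, new.tag)]

    changes = []
    if old.attrib != new.attrib:
        changes.append(("attrib", here, dict(new.attrib)))
    old_text = (old.text or "").strip()
    new_text = (new.text or "").strip()
    if old_text != new_text:
        changes.append(("text", here, new_text))

    old_children, new_children = list(old), list(new)
    # Recurse into the children that exist at the same position in both trees.
    for i, (a, b) in enumerate(zip(old_children, new_children)):
        changes.extend(diff_elements(a, b, here, i))
    # Leftover children on either side are deletions or insertions.
    for extra in old_children[len(new_children):]:
        changes.append(("delete", here, extra.tag))
    for extra in new_children[len(old_children):]:
        changes.append(("insert", here, extra.tag))
    return changes


old_doc = ET.fromstring("<book><title>Diff</title><year>2011</year></book>")
new_doc = ET.fromstring("<book><title>Diff</title><year>2012</year><isbn/></book>")
changes = diff_elements(old_doc, new_doc)
for change in changes:
    print(change)
```

For these two sample documents it reports a text change on `year` and an inserted `isbn` child. Positional matching is the weak point: moved or reordered nodes are not recognized, which is the kind of problem the paper and the tool above address.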

Abhishek Jain answered Oct 14 '22 08:10