I need to implement a Diff algorithm in VB.NET to find the changes between two different versions of a piece of text. I've had a scout around the web and have found a couple of different algorithms.
Does anybody here know of a 'best' algorithm that I could implement?
Myers algorithm (human-readable diffs): this is used by tools such as Git's diff and GNU diff. The original Myers algorithm has O(ND) time and space complexity, where N is the sum of the lengths of both inputs and D is the size of the minimum edit script that converts one input to the other.
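To make the O(ND) idea concrete, here is a minimal sketch of the greedy forward search from Myers' paper, written in Python rather than VB.NET purely for illustration. It only returns D (the number of insertions plus deletions); recovering the actual edit script needs the per-round state to be kept, which this sketch leaves out. The function name `myers_edit_distance` is hypothetical.

```python
def myers_edit_distance(a, b):
    """Return D, the number of insertions plus deletions in a shortest edit script."""
    n, m = len(a), len(b)
    max_d = n + m
    # v[k] is the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(max_d + 1):
        for k in range(-d, d + 1, 2):
            # Extend from whichever neighbouring diagonal got further.
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: insert an element of b
            else:
                x = v[k - 1] + 1    # step right: delete an element of a
            y = x - k
            # Follow the "snake": consume matching elements for free.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return max_d


if __name__ == "__main__":
    old = ["line one", "line two", "line three"]
    new = ["line one", "line three"]
    print(myers_edit_distance(old, new))
```

The inputs can be any sequences with comparable elements, so the same function works on lists of lines or on plain strings. The outer loop runs D + 1 times and the inner loop touches at most 2D + 1 diagonals, which is why similar inputs are cheap to compare.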
Git supports four diff algorithms: Myers, Minimal, Patience, and Histogram. Myers is the default.
Patience diff (based on longest increasing subsequence): Patience diff computes the difference between two lists, for example the lines of two versions of a source file. It provides a good balance of performance, nice output for humans, and implementation simplicity.
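As a rough illustration of the core step, the sketch below (Python, not VB.NET) finds the lines that occur exactly once in each input and then uses patience sorting to pick the longest run of them that appears in the same order in both. A full patience diff would then recurse on the gaps between these anchors and fall back to another diff where no unique lines remain; that part is omitted here. The name `patience_anchors` is hypothetical.

```python
from bisect import bisect_left


def patience_anchors(a, b):
    """Return (i, j) pairs of matching unique lines, increasing in both i and j."""
    def unique_positions(lines):
        seen = {}
        for idx, line in enumerate(lines):
            seen.setdefault(line, []).append(idx)
        # Keep only lines that occur exactly once.
        return {line: pos[0] for line, pos in seen.items() if len(pos) == 1}

    ua, ub = unique_positions(a), unique_positions(b)
    # Dicts keep insertion order, so these pairs are already sorted by i.
    pairs = [(i, ub[line]) for line, i in ua.items() if line in ub]

    # Patience sorting: longest increasing subsequence over the j values.
    tops = []      # tops[p] = smallest j that ends an increasing run of length p + 1
    top_idx = []   # index into pairs of the card currently on top of pile p
    back = [None] * len(pairs)
    for n, (_, j) in enumerate(pairs):
        p = bisect_left(tops, j)
        if p == len(tops):
            tops.append(j)
            top_idx.append(n)
        else:
            tops[p] = j
            top_idx[p] = n
        back[n] = top_idx[p - 1] if p > 0 else None

    # Walk the back-pointers from the top of the last pile.
    result = []
    n = top_idx[-1] if top_idx else None
    while n is not None:
        result.append(pairs[n])
        n = back[n]
    result.reverse()
    return result


if __name__ == "__main__":
    old = ["a", "b", "c", "d"]
    new = ["b", "a", "c", "d"]
    print(patience_anchors(old, new))
```

Anchoring on unique lines is what tends to keep repeated lines such as lone braces or blank lines from being matched against the wrong neighbour.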
Git's four diff algorithms (Myers, Minimal, Patience, and Histogram) compute the differences between two versions of the same file in different commits. Minimal and Histogram are refinements of Myers and Patience, respectively.
Well, I've used the C# version on CodeProject and it's really good for what I wanted...
http://www.codeproject.com/KB/recipes/diffengine.aspx
You can probably get this translated into VB.NET via an online converter if you can't do it yourself...
I like "An O(ND) Difference Algorithm and Its Variations" by Eugene Myers. I believe it's the algorithm that was used in GNU diff. For good background, see Wikipedia.
This is quite theoretical and you might wish to find source code, but I'm not aware of any in VB.