I'm looking for a solution to compare two versions of the same file to get a representation of the changes/differences.
Myers algorithm: human-readable diffs. The Myers algorithm belongs to the string-correction family and is widely used by tools tuned to generate human-readable delta/patch files from human-readable inputs, such as git diff and GNU diff.
Myers. The Myers algorithm was developed by Myers (1986) and is the default algorithm of the git diff command. It searches for the shortest edit script, the smallest set of insertions and deletions that turns one sequence into the other, by following the longest runs of identical elements that the two sequences share.
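To make the "shortest edit script" idea concrete, here is a minimal sketch of the greedy form of the Myers algorithm in Python. It only returns the length of the script (the number of insertions plus deletions), not the script itself. The function name myers_edit_distance is made up for illustration, and this is not the tuned implementation that Git or GNU diff ship.

```python
def myers_edit_distance(a, b):
    """Length of the shortest insert/delete script turning a into b.

    Illustrative sketch of Myers' greedy O(ND) search; the name is hypothetical.
    """
    n, m = len(a), len(b)
    # v[k] holds the furthest x reached so far on diagonal k, where k = x - y.
    v = {1: 0}
    for d in range(n + m + 1):
        for k in range(-d, d + 1, 2):
            if k == -d or (k != d and v[k - 1] < v[k + 1]):
                x = v[k + 1]        # step down: insert an element of b
            else:
                x = v[k - 1] + 1    # step right: delete an element of a
            y = x - k
            # Follow the "snake": a run of equal elements costs nothing.
            while x < n and y < m and a[x] == b[y]:
                x += 1
                y += 1
            v[k] = x
            if x >= n and y >= m:
                return d
    return n + m


print(myers_edit_distance("ABCABBA", "CBABAC"))
```

The inputs can be strings or lists of lines; a line-based diff tool would pass lists of lines so that each edit is a whole line.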
The diff command is invoked from the command line, passing it the names of two files: diff original new. The output of the command represents the changes required to transform the original file into the new file. If original and new are directories, then diff will be run on each file that exists in both directories.
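If you want the same kind of output from code rather than from the command line, Python's standard difflib module can produce a unified diff similar to what diff -u prints. This is only a sketch: the helper name unified is made up, and difflib uses its own matching heuristic (SequenceMatcher), not the Myers algorithm, so the hunks may differ from what diff or git diff would show.

```python
import difflib


def unified(original_text, new_text, original_name="original", new_name="new"):
    """Return a unified diff that describes how to turn original_text into new_text."""
    # Keep line endings so the joined output is a well-formed patch body.
    a = original_text.splitlines(keepends=True)
    b = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(a, b, fromfile=original_name, tofile=new_name)
    )


print(unified("a\nb\nc\n", "a\nB\nc\n"))
```

Lines starting with - exist only in the original, lines starting with + exist only in the new version, and identical inputs give an empty string.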
Alternatively referred to as compare, diff is short for different or difference and describes a program's ability to show the differences between two or more files. A diff is an invaluable tool in programming because it lets a developer see what has changed between versions.
If it's plain text, then Google's diff-match-patch library ought to do what you want (it has a C# version).
If it's binary data, then look into the things people do to apply updates to executables (bsdiff and Courgette). They look for the minimum difference between two files so that a smaller update can be sent out to end users. Sounds similar to your needs.