I know that I can use cmp, diff, etc. to compare two files, but what I am looking for is a utility that gives me the percentage difference between two files.
If there is no such utility, an algorithm would do fine too. I have read about fuzzy programming, but I have not quite understood it.
On Windows, the File Compare (FC) command in Command Prompt is another option if you need a text or binary comparison. The output is printed in Command Prompt and is not easy to read. For any file format that Word can open, the Compare option in Word is the easiest to use.
To compare two files in Notepad++, first open both of them in the application. Then go to the Plugins menu and select Compare → Compare. This opens a new window showing the two files side by side, with any differences highlighted.
On the File menu, click Compare Files. In the Select First File dialog box, locate and then click a file name for the first file in the comparison, and then click Open. In the Select Second File dialog box, locate and then click a file name for the second file in the comparison, and then click Open.
You can use the ratio() method of difflib.SequenceMatcher.
From the documentation:
Return a measure of the sequences’ similarity as a float in the range [0, 1].
For example:
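from difflib import SequenceMatcher

# file1 and file2 are the paths of the two files to compare
with open(file1) as f1, open(file2) as f2:
    text1, text2 = f1.read(), f2.read()
m = SequenceMatcher(None, text1, text2)
# ratio() is a similarity in [0, 1]; 1.0 means the texts are identical
print(m.ratio() * 100)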
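Since the question asks for a percentage difference rather than a similarity, here is a minimal sketch that wraps the same idea in a function. The name percent_difference is made up for illustration, and comparing line by line instead of character by character is an assumption on my part: it is usually much faster on large files, but it counts a line as changed even if only one character in it differs.

```python
from difflib import SequenceMatcher
from pathlib import Path


def percent_difference(path_a, path_b):
    """Return how different two text files are, as a percentage (0 = identical)."""
    # Compare sequences of lines rather than one long string of characters.
    lines_a = Path(path_a).read_text().splitlines()
    lines_b = Path(path_b).read_text().splitlines()
    # autojunk=False turns off difflib's heuristic that ignores very
    # frequent elements in long sequences, which could skew the ratio.
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    # ratio() is a similarity in [0, 1], so invert it to get a difference.
    return (1.0 - matcher.ratio()) * 100.0
```

Calling percent_difference("old.txt", "new.txt") (placeholder file names) returns a float between 0 and 100. If you need character-level precision, pass the full file contents to SequenceMatcher as in the answer above.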
It looks like Linux has a utility called dwdiff that can report percentage differences when run with the "-s" flag.
http://www.softpanorama.org/Utilities/diff_tools.shtml