
Percentage difference between two text files

I know that I can use cmp, diff, etc. to compare two files, but what I am looking for is a utility that gives me the percentage difference between two files.

If there is no such utility, an algorithm would do fine too. I have read about fuzzy programming, but I have not quite understood it.

Mohamed asked Aug 26 '09 13:08




2 Answers

You can use the ratio() method of Python's difflib.SequenceMatcher.

From the documentation:

Return a measure of the sequences’ similarity as a float in the range [0, 1].

For example:

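from difflib import SequenceMatcher

# file1 and file2 are the paths of the two files to compare
with open(file1) as f1, open(file2) as f2:
    text1, text2 = f1.read(), f2.read()

m = SequenceMatcher(None, text1, text2)
similarity = m.ratio()  # 1.0 means the texts are identical
print(f"{(1 - similarity) * 100:.1f}% different")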
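Comparing character by character can be slow on large files. As a rough sketch of one alternative, the same ratio can be computed over lines instead and turned into a percentage. The function name percent_difference is only illustrative, and passing autojunk=False is an assumption made here so that difflib's "popular element" heuristic does not skew the result on long inputs:

```python
from difflib import SequenceMatcher


def percent_difference(text_a: str, text_b: str) -> float:
    """Return how different two texts are, as a percentage from 0 to 100."""
    # Compare whole lines rather than single characters.
    lines_a = text_a.splitlines()
    lines_b = text_b.splitlines()
    # autojunk=False turns off the heuristic that ignores very frequent items.
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    # ratio() is 2*M/T: M matching lines, T total lines in both inputs.
    return (1.0 - matcher.ratio()) * 100.0


if __name__ == "__main__":
    old = "alpha\nbeta\ngamma\ndelta\n"
    new = "alpha\nbeta\ngamma\nepsilon\n"
    print(f"{percent_difference(old, new):.1f}% different")  # 25.0% different
```

Note that ratio() measures similarity relative to the combined length of both inputs, so one changed line out of four gives 25%, and two texts with no lines in common give 100%.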
Nadia Alramli answered Oct 12 '22 21:10


It looks like Linux has a utility called dwdiff that can report percentage differences when run with the "-s" flag.

http://www.softpanorama.org/Utilities/diff_tools.shtml
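If dwdiff is not available, a word-level comparison in a similar spirit can be approximated in Python. This is only a sketch: it does not reproduce dwdiff's algorithm or its output format, it assumes words are separated by whitespace, and the function name word_stats and the keys of the returned dictionary are made up for illustration:

```python
from difflib import SequenceMatcher


def word_stats(old_text: str, new_text: str) -> dict:
    """Count common, deleted and inserted words between two texts."""
    # Assumption: words are whatever str.split() yields (whitespace-delimited).
    old_words = old_text.split()
    new_words = new_text.split()
    matcher = SequenceMatcher(None, old_words, new_words, autojunk=False)
    # Sum the sizes of all matching blocks to get the number of shared words.
    common = sum(block.size for block in matcher.get_matching_blocks())
    return {
        "common": common,
        "deleted": len(old_words) - common,
        "inserted": len(new_words) - common,
        # Share of each text's words that also appear, in order, in the other.
        "percent_common_old": 100.0 * common / len(old_words) if old_words else 100.0,
        "percent_common_new": 100.0 * common / len(new_words) if new_words else 100.0,
    }


if __name__ == "__main__":
    stats = word_stats("the quick brown fox", "the quick red fox")
    print(stats["common"], stats["deleted"], stats["inserted"])  # 3 1 1
    print(f"{stats['percent_common_old']:.0f}% of the old words are unchanged")
```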

brien answered Oct 12 '22 21:10