How to Calculate Levenshtein distance between two .txt files? [closed]

Question

Is there a standard Linux command for it? If not, can anyone describe a Python script that does the same?

Luxusproblem · Accepted Answer

It depends. When the OCR outputs are similar and only a few differences are expected, you could split both texts and compare them word by word or line by line, and only compute the Levenshtein distance for the parts in which differences occur. This works when both texts have the same number of lines. E.g.:

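from Levenshtein import distance  # assumption: the python-Levenshtein package

def textLevi(txt1, txt2):
    # Pair the lines up; zip stops at the shorter text, so both
    # texts should have the same number of lines.
    lines = list(zip(txt1.split("\n"), txt2.split("\n")))
    total = 0  # not named "distance", which would shadow the imported function
    for i, (line1, line2) in enumerate(lines, 1):
        if line1 != line2:
            actDistance = distance(line1, line2)
            print("Distance of line %d:" % i, actDistance)
            total += actDistance

    print("Sum of Lv Distances:", total)

textLevi("Hello I \n like cheese", "Hello I \n like cheddar")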

would create the Output:

Distance of line 2: 4

Sum of Lv Distances: 4
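
Since the question asks about two .txt files, the same line-by-line idea can be sketched for files with no third-party package. This is only a minimal sketch under some assumptions: the names levenshtein and file_distance are made up for illustration, the files are assumed to be UTF-8, and lines missing from the shorter file are compared against an empty string.

```python
from pathlib import Path


def levenshtein(a: str, b: str) -> int:
    """Edit distance using the classic two-row dynamic programme."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def file_distance(path1, path2, encoding="utf-8"):
    """Sum the per-line distances of two text files."""
    lines1 = Path(path1).read_text(encoding=encoding).splitlines()
    lines2 = Path(path2).read_text(encoding=encoding).splitlines()
    total = 0
    for i in range(max(len(lines1), len(lines2))):
        # A line missing from the shorter file is treated as empty.
        line1 = lines1[i] if i < len(lines1) else ""
        line2 = lines2[i] if i < len(lines2) else ""
        if line1 != line2:
            total += levenshtein(line1, line2)
    return total
```

Calling file_distance("a.txt", "b.txt") (placeholder file names) returns the summed per-line distance. Note that this sum is not the same number as the distance of the two files taken as single strings; for that, pass the full file contents to levenshtein directly, which is slower on large files.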

Tags:

python

linux

levenshtein-distance

blastoise
