How to calculate Levenshtein Distance matrix of strings in Python
        str1  str2  str3  str4  ...  strn
str1    0.8   0.4   0.6   0.1   ...  0.2
str2    0.4   0.7   0.5   0.1   ...  0.1
str3    0.6   0.5   0.6   0.1   ...  0.1
str4    0.1   0.1   0.1   0.5   ...  0.6
.       .     .     .     .     ...  .
.       .     .     .     .     ...  .
.       .     .     .     .     ...  .
strn    0.2   0.1   0.1   0.6   ...  0.7
Using the distance function, we can calculate the distance between two words. But here I have one list containing n strings. I want to calculate the distance matrix for all pairs of strings and then cluster the words.
Here is my code
import pandas as pd
from Levenshtein import distance
import numpy as np

Target = ['Tree', 'Trip', 'Treasure', 'Nothingtodo']
List1 = Target
List2 = Target

# Matrix[i, j] holds the edit distance between List1[i] and List2[j]
Matrix = np.zeros((len(List1), len(List2)), dtype=int)  # np.int was removed in NumPy 1.24
for i in range(len(List1)):
    for j in range(len(List2)):
        Matrix[i, j] = distance(List1[i], List2[j])

print(Matrix)
[[ 0  2  4 11]
 [ 2  0  6 10]
 [ 4  6  0 11]
 [11 10 11  0]]
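For the clustering step mentioned in the question, here is a minimal sketch that uses only the standard library. Several things in it are assumptions rather than part of the original code: the `levenshtein` helper is a hand-written stand-in for `Levenshtein.distance`, `distance_matrix` and `threshold_clusters` are hypothetical names, the distance is normalized by the longer string's length so that values fall between 0 and 1, and the 0.5 threshold is an arbitrary choice. The grouping is a simple single-linkage scheme, which is only one of many ways to cluster on a distance matrix.

```python
def levenshtein(a, b):
    """Edit distance between a and b (stand-in for Levenshtein.distance)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def distance_matrix(words, normalize=False):
    """Pairwise distances; only the upper triangle is computed, then mirrored."""
    n = len(words)
    matrix = [[0.0 if normalize else 0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = levenshtein(words[i], words[j])
            if normalize:
                # Assumption: scale by the longer string so values lie in [0, 1]
                d = d / max(len(words[i]), len(words[j]), 1)
            matrix[i][j] = matrix[j][i] = d
    return matrix


def threshold_clusters(words, matrix, threshold):
    """Single-linkage grouping: two words share a cluster when a chain of
    pairs with distance <= threshold connects them (union-find)."""
    parent = list(range(len(words)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            if matrix[i][j] <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for index, word in enumerate(words):
        groups.setdefault(find(index), []).append(word)
    return list(groups.values())


target = ['Tree', 'Trip', 'Treasure', 'Nothingtodo']
raw = distance_matrix(target)
normalized = distance_matrix(target, normalize=True)
clusters = threshold_clusters(target, normalized, threshold=0.5)  # arbitrary threshold
print(raw)
print(clusters)
```

With the four example words, `raw` matches the matrix printed above, and a threshold of 0.5 puts 'Tree', 'Trip' and 'Treasure' in one group and leaves 'Nothingtodo' on its own. Because the matrix is symmetric with a zero diagonal, computing only the pairs with j > i roughly halves the number of distance calls compared with the full double loop.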