I have seen a lot of code to solve that but I am not able to see why they are using a matrix to represent the distance between two words. Can any one please explain to me?
Here is a sample code I found:
public static int minDistance(String word1, String word2)
{
int l1 = word1.length(), l2 = word2.length();
int[][] d = new int[l1 + 1][l2 + 1];
// the edit distance between an empty string and the prefixes of
// word2
for (int i = 0; i < l2 + 1; i++) {
d[0][i] = i;
}
// the edit distance between an empty string and the prefixes of
// word1
for (int j = 0; j < l1 + 1; j++) {
d[j][0] = j;
}
for (int i = 1; i < l1 + 1; i++) {
for (int j = 1; j < l2 + 1; j++) {
if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
d[i][j] = d[i - 1][j - 1];
} else {
d[i][j] = min(1 + d[i][j - 1], 1 + d[i - 1][j],
1 + d[i - 1][j - 1]); // min of insertion,
// deletion, replacement
}
}
}
return d[l1][l2];
}
Your code is calculating the Levenshtein distance using dynamic programming.
The array d
will ultimately contain the solution to various sub-problems, where d[i][j]
is the distance between the first i
letters of the first word and the first j
letters of the second. There is a relationship between d[i][j]
and the entries d[i-1][j]
, d[i][j-1]
and d[i-1][j-1]
. The algorithm calculates the entries of the table in such a way that required sub-problems have already been calculated (this is the dynamic programming part).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With