Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Edit distance explanation

I have seen a lot of code to solve that but I am not able to see why they are using a matrix to represent the distance between two words. Can any one please explain to me?

Here is a sample code I found:

public static int minDistance(String word1, String word2)
{
    int l1 = word1.length(), l2 = word2.length();

    int[][] d = new int[l1 + 1][l2 + 1];

    // the edit distance between an empty string and the prefixes of
    // word2
    for (int i = 0; i < l2 + 1; i++) {
        d[0][i] = i;
    }

    // the edit distance between an empty string and the prefixes of
    // word1
    for (int j = 0; j < l1 + 1; j++) {
        d[j][0] = j;
    }

    for (int i = 1; i < l1 + 1; i++) {
        for (int j = 1; j < l2 + 1; j++) {
            if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                d[i][j] = d[i - 1][j - 1];
            } else {
                d[i][j] = min(1 + d[i][j - 1], 1 + d[i - 1][j],
                1 + d[i - 1][j - 1]); // min of insertion,
                // deletion, replacement
            }
        }
    }

    return d[l1][l2];
}
like image 385
Srujan Kumar Gulla Avatar asked Dec 16 '22 16:12

Srujan Kumar Gulla


1 Answers

Your code is calculating the Levenshtein distance using dynamic programming.

The array d will ultimately contain the solution to various sub-problems, where d[i][j] is the distance between the first i letters of the first word and the first j letters of the second. There is a relationship between d[i][j] and the entries d[i-1][j], d[i][j-1] and d[i-1][j-1]. The algorithm calculates the entries of the table in such a way that required sub-problems have already been calculated (this is the dynamic programming part).

like image 76
borrible Avatar answered Jan 19 '23 01:01

borrible