Similarity Score - Levenshtein

I implemented the Levenshtein algorithm in Java, and it now gives me the number of corrections it needs, a.k.a. the cost. This helps a little, but not much, since I want the result as a percentage.

So I want to know how to calculate a similarity score from that cost.

I would also like to know how you do it, and why.

asked May 22 '11 10:05

N00programmer

4 Answers

The Levenshtein distance between two strings is defined as the minimum number of edits needed to transform one string into the other, with the allowable edit operations being insertion, deletion, or substitution of a single character. (Wikipedia)

So a Levenshtein distance of 0 means that both strings are equal.
The maximum Levenshtein distance (all characters are different) is max(string1.length, string2.length).

So if you need a percentage, you have to use these two points to scale the distance. For example:

"Hallo", "Hello" -> Levenshtein distance 1. The maximum Levenshtein distance for these two strings is 5, so 20% of the characters do not match (an 80% match).

String s1 = "Hallo";
String s2 = "Hello";
int lfd = calculateLevensteinDistance(s1, s2);
// fraction of characters that differ (0.2 here); 1.0 - ratio is the similarity
double ratio = ((double) lfd) / (Math.max(s1.length(), s2.length()));
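
The snippet above calls calculateLevensteinDistance without defining it. As a rough, self-contained sketch of the same scaling, something like the following could work. The names SimilaritySketch, levenshtein and similarityPercent are made up for illustration, and the distance routine is a plain two-row dynamic-programming version, not necessarily what the answer had in mind. It reports the similarity (100 minus the mismatch percentage) and treats two empty strings as a full match, which is an assumption on my part.

```java
// Sketch only: SimilaritySketch, levenshtein and similarityPercent are
// hypothetical names, not part of the answer above or of any library.
final class SimilaritySketch {

    // Two-row dynamic-programming Levenshtein distance.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int deletion = prev[j] + 1;
                int insertion = curr[j - 1] + 1;
                curr[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            // The row just filled becomes the previous row for the next character.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Scales the distance to 0..100, where 100 means the strings are equal.
    static double similarityPercent(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 100.0; // assumption: two empty strings count as identical
        }
        return 100.0 * (maxLen - levenshtein(a, b)) / maxLen;
    }

    public static void main(String[] args) {
        // "Hallo" vs "Hello": distance 1 out of a maximum of 5
        System.out.println(similarityPercent("Hallo", "Hello"));
    }
}
```

Dividing by the longer length keeps the result between 0 and 100, because the distance can never exceed max(s1.length(), s2.length()).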

answered Oct 22 '22 02:10

Ralph

You can download Apache Commons Lang and investigate (and maybe use) the Levenshtein distance implementation in its StringUtils class.

answered Oct 22 '22 01:10

Roman

// Refer to this example:

public class demo 
{
public static void main(String[] args) 
{
    String str1 = "12345";
    String str2 = "122345";

    int re = percentageOfTextMatch(str1, str2);
    System.out.println("Matching Percent: " + re);
}

public static int percentageOfTextMatch(String s0, String s1) 
{
    // Trim and collapse runs of whitespace into a single space
    s0 = s0.trim().replaceAll("\\s+", " ");
    s1 = s1.trim().replaceAll("\\s+", " ");
    // Scale by the longer length, which is the maximum possible distance
    int maxLength = Math.max(s0.length(), s1.length());
    if (maxLength == 0)
        return 100; // both strings are empty, so they match completely
    return (int) (100 - (float) LevenshteinDistance(s0, s1) * 100 / maxLength);
}

public static int LevenshteinDistance(String s0, String s1) {

    int len0 = s0.length() + 1;
    int len1 = s1.length() + 1;  
    // the array of distances
    int[] cost = new int[len0];
    int[] newcost = new int[len0];

    // initial cost of skipping prefix in String s0
    for (int i = 0; i < len0; i++)
        cost[i] = i;

    // dynamically computing the array of distances

    // transformation cost for each letter in s1
    for (int j = 1; j < len1; j++) {

        // initial cost of skipping prefix in String s1
        // (j characters skipped, so the cost is j, not j - 1)
        newcost[0] = j;

        // transformation cost for each letter in s0
        for (int i = 1; i < len0; i++) {

            // matching current letters in both strings
            int match = (s0.charAt(i - 1) == s1.charAt(j - 1)) ? 0 : 1;

            // computing cost for each transformation
            int cost_replace = cost[i - 1] + match;
            int cost_insert = cost[i] + 1;
            int cost_delete = newcost[i - 1] + 1;

            // keep minimum cost
            newcost[i] = Math.min(Math.min(cost_insert, cost_delete),
                    cost_replace);
        }

        // swap cost/newcost arrays
        int[] swap = cost;
        cost = newcost;
        newcost = swap;
    }

    // the distance is the cost for transforming all letters in both strings
    return cost[len0 - 1];
}

}

answered Oct 22 '22 02:10

Vishal Tathe

LevenshteinDistance (Apache Commons Text)

It can be used through a Maven dependency.

I think it is better to use this implementation than to write your own.

<dependency>
    <groupId>org.apache.commons</groupId>
    <artifactId>commons-text</artifactId>
    <version>1.3</version>
</dependency>

As an example, have a look at the code below:

import org.apache.commons.text.similarity.LevenshteinDistance;

public class MetricUtils {
    private static LevenshteinDistance lv = new LevenshteinDistance();

    public static void main(String[] args) {
        String s = "running";
        String s1 = "runninh";
        System.out.println(levensteinRatio(s, s1));
    }

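    public static double levensteinRatio(String s, String s1) {
        int maxLength = Math.max(s.length(), s1.length());
        if (maxLength == 0) {
            return 1.0; // two empty strings are identical; avoids 0/0
        }
        return 1 - ((double) lv.apply(s, s1)) / maxLength;
    }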
}

answered Oct 22 '22 00:10

Alex

Tags:

java

levenshtein-distance

similarity