
How to compare almost similar Strings in Java? (String distance measure) [closed]

I would like to compare two strings and get a score for how much they look alike. For example, "The sentence is almost similar" and "The sentence is similar".

I'm not familiar with existing methods in Java, but for PHP I know the levenshtein function.

Are there better methods in Java?

hsmit asked Jan 18 '10 08:01



1 Answer

The following Java libraries offer multiple string comparison algorithms (Levenshtein, Jaro-Winkler, ...):

  1. Apache Commons Lang 3: https://commons.apache.org/proper/commons-lang/
  2. Simmetrics: http://sourceforge.net/projects/simmetrics/

Both libraries provide Javadoc documentation (Apache Commons Lang Javadoc, Simmetrics Javadoc).

    // Usage of Apache Commons Lang 3
    // (newer Commons Lang releases deprecate this method in favor of Apache Commons Text)
    import org.apache.commons.lang3.StringUtils;

    public double compareStrings(String stringA, String stringB) {
        return StringUtils.getJaroWinklerDistance(stringA, stringB);
    }

    // Usage of Simmetrics
    import uk.ac.shef.wit.simmetrics.similaritymetrics.JaroWinkler;

    public double compareStrings(String stringA, String stringB) {
        JaroWinkler algorithm = new JaroWinkler();
        // getSimilarity returns a float, which is widened to double here
        return algorithm.getSimilarity(stringA, stringB);
    }
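If adding a dependency is not an option, the Levenshtein distance mentioned in the question can also be written in plain Java. The following is only a minimal sketch, not code from either library: the class name `StringSimilarity` and both method names are made up for illustration, and dividing the distance by the longer string's length is just one common way to turn it into a score between 0 and 1.

```java
// Hypothetical helper class; the names are illustrative and not part of any library.
final class StringSimilarity {

    private StringSimilarity() {
    }

    // Levenshtein edit distance: the minimum number of single-character
    // insertions, deletions and substitutions needed to turn a into b.
    // Uses two rolling rows instead of a full (a.length+1) x (b.length+1) table.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap the rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 means identical, 0.0 means nothing in common.
    static double similarity(String a, String b) {
        int maxLen = Math.max(a.length(), b.length());
        if (maxLen == 0) {
            return 1.0; // two empty strings are treated as identical
        }
        return 1.0 - (double) levenshtein(a, b) / maxLen;
    }
}
```

For the two sentences from the question, `StringSimilarity.levenshtein(...)` gives 7 (the seven characters of "almost " are deleted), so `similarity(...)` comes out at roughly 0.77. Unlike Jaro-Winkler, this score does not give extra weight to a shared prefix, so the two measures can rank the same pairs differently.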
FiveO answered Sep 20 '22 13:09