Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Comparing two Strings in java and identifying duplicate words

I'm trying be able to compare two Strings and identify duplicate words. For example;

String1 = "Hello, my name is John."
String2 = "Can you tell me your name please?"

Comparing String1 and String2 would return the word; "name".

I know it is possible to split these two strings into an array of words, and then iterate over each word of each String in a 2-D array. However this is computationally expensive at O(n^2) and I was wondering if there is a faster way of doing this?

Thanks.

EDIT: Changed the example for clarity.

like image 544
Dan_Dan_Man Avatar asked Jan 08 '13 16:01

Dan_Dan_Man


People also ask

Can we compare 2 strings using == in Java?

To compare these strings in Java, we need to use the equals() method of the string. You should not use == (equality operator) to compare these strings because they compare the reference of the string, i.e. whether they are the same object or not.

How do I compare two strings in Word in Java?

One solution is to use Java compareTo() method. The method compareTo() is used for comparing two strings lexicographically in Java.


2 Answers

After getting the strings to word arrays:

You can add all the elements in the first array to a hashmap and then scan the second array to see if each of the elements exists in the hashmap. Since access time to a hashmap is O(1), this will be O(n+m) time complexity.

If you do not want to use extra space, you can sort both of the arrays in O(nlogn) and then compare the items in O(n+m) which would give you O(nlogn) in total.

like image 60
Faruk Sahin Avatar answered Sep 19 '22 18:09

Faruk Sahin


One simple solution is to use the Sets.intersection method of Guava's Sets. It is pretty easy:

String s1 = "Hello, my name is John.";
String s2 = "Can you tell me your name?";
Splitter splitter = Splitter.onPattern("\\W").trimResults().omitEmptyStrings();
Set<String> intersection = Sets.intersection(//
        Sets.newHashSet(splitter.split(s1)), //
        Sets.newHashSet(splitter.split(s2)));
System.out.println(intersection);

Output:

[name]

You can also find more information on algorithms to detect Set intersection on this thread.

like image 41
Alex Avatar answered Sep 21 '22 18:09

Alex