I am trying to sort a map based on word frequency (i.e., based on value). For that I have overridden comparator and passed to TreeMap
, but I am getting this weird output.
public class WordFrequency {
public static String sentence = "one three two two three three four four four";
public static Map<String, Integer> map;
public static void main(String[] args) {
map = new HashMap<>();
String[] words = sentence.split("\\s");
for (String word : words) {
Integer count = map.get(word);
if (count == null) {
count = 1;
} else {
++count;
}
map.put(word, count);
}
Comparator<String> myComparator = new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
if (map.get(s1) < map.get(s2)) {
return -1;
} else if (map.get(s1) > map.get(s2)) {
return 1;
} else {
return 0;
}
}
};
SortedMap<String, Integer> sortedMap = new TreeMap<String, Integer>(myComparator);
System.out.println("Before sorting: " + map);
sortedMap.putAll(map);
System.out.println("After Sorting based on value:" + sortedMap);
}
}
Output:
Before sorting: {two=2, one=1, three=3, four=3}
After sorting based on value:{one=1, two=2, three=3}
Expected Output:
{one=1, two=2, four=3,three=3}
Your compare
method fails to obey the contract of the Map interface, since it compares values instead of keys. Your implementation causes two keys with the same value to be considered the same key. Therefore your sortedMap
doesn't contain the "four" key, which has the same value as the "three" key.
Note that the ordering maintained by a tree map, like any sorted map, and whether or not an explicit comparator is provided, must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a sorted map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.
TreeMap reference
You can fix this problem by comparing the keys when the values are equal :
Comparator<String> myComparator = new Comparator<String>() {
@Override
public int compare(String s1, String s2) {
if (map.get(s1) < map.get(s2)) {
return -1;
} else if (map.get(s1) > map.get(s2)) {
return 1;
} else {
return s1.compareTo(s2);
}
}
};
This should give you an output of :
After sorting based on value:{one=1, two=2, four=3, three=3}
Since four<three
based on the natural ordering of Strings.
Because of your compare()
is consider values only in the Map
. Then three=3, four=3
has same value 3
. Then those consider as duplicates when they add to TreeMap
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With