Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

While sorting the map based on value, some values are missing. What causes this weird behaviour?

I am trying to sort a map based on word frequency (i.e., based on value). For that I have overridden comparator and passed to TreeMap, but I am getting this weird output.

public class WordFrequency {
    public static String sentence = "one three two two three three four four four";
    public static Map<String, Integer> map;

    public static void main(String[] args) {
        map = new HashMap<>();
        String[] words = sentence.split("\\s");

        for (String word : words) {
            Integer count = map.get(word);
            if (count == null) {
                count = 1;
            } else {
                ++count;
            }
            map.put(word, count);
        }

        Comparator<String> myComparator = new Comparator<String>() {

            @Override
            public int compare(String s1, String s2) {
                if (map.get(s1) < map.get(s2)) {
                    return -1;
                } else if (map.get(s1) > map.get(s2)) {
                    return 1;
                } else {
                    return 0;
                }
            }

        };
        SortedMap<String, Integer> sortedMap = new TreeMap<String, Integer>(myComparator);
        System.out.println("Before sorting: " + map);
        sortedMap.putAll(map);
        System.out.println("After Sorting based on value:" + sortedMap);

    }
}

Output:

Before sorting: {two=2, one=1, three=3, four=3}
After sorting based on value:{one=1, two=2, three=3}

Expected Output:

{one=1, two=2, four=3,three=3}
like image 305
Sagar Pudi Avatar asked Oct 31 '14 04:10

Sagar Pudi


2 Answers

Your compare method fails to obey the contract of the Map interface, since it compares values instead of keys. Your implementation causes two keys with the same value to be considered the same key. Therefore your sortedMap doesn't contain the "four" key, which has the same value as the "three" key.

Note that the ordering maintained by a tree map, like any sorted map, and whether or not an explicit comparator is provided, must be consistent with equals if this sorted map is to correctly implement the Map interface. (See Comparable or Comparator for a precise definition of consistent with equals.) This is so because the Map interface is defined in terms of the equals operation, but a sorted map performs all key comparisons using its compareTo (or compare) method, so two keys that are deemed equal by this method are, from the standpoint of the sorted map, equal. The behavior of a sorted map is well-defined even if its ordering is inconsistent with equals; it just fails to obey the general contract of the Map interface.

TreeMap reference

You can fix this problem by comparing the keys when the values are equal :

    Comparator<String> myComparator = new Comparator<String>() {

        @Override
        public int compare(String s1, String s2) {
            if (map.get(s1) < map.get(s2)) {
                return -1;
            } else if (map.get(s1) > map.get(s2)) {
                return 1;
            } else {
                return s1.compareTo(s2);
            }
        }

    };

This should give you an output of :

After sorting based on value:{one=1, two=2, four=3, three=3}

Since four<three based on the natural ordering of Strings.

like image 131
Eran Avatar answered Sep 27 '22 17:09

Eran


Because of your compare() is consider values only in the Map. Then three=3, four=3 has same value 3. Then those consider as duplicates when they add to TreeMap.

like image 28
Ruchira Gayan Ranaweera Avatar answered Sep 27 '22 17:09

Ruchira Gayan Ranaweera