
Best way to get top N keys(sorted by values) in a HashMap

Tags:

java

The original data looks like this:

String data = "{ \"a\":1, \"b\":3 , \"c\":-1 }";

My first step is to convert it into a HashMap:

Gson gson = new Gson();
HashMap<String, Double> map = gson.fromJson(data, HashMap.class);

And then sort the keys by their values:

public static List<String> sortHashMap(final HashMap<String, Double> map) {
    Set<String> set = map.keySet();
    List<String> keys = new ArrayList<String>(set);

    Collections.sort(keys, new Comparator<String>() {

        @Override
        public int compare(String s1, String s2) {
            if (map.get(s1) < map.get(s2)) {
                return 1;
            }
            return 0;
        }
    });

    return keys;
}

At last, get top N keys:

keys.subList(0, N);

I do get the right result in the end, but I don't think this is an elegant way to do it.

Is there a more convenient way to do this?

WoooHaaaa asked Sep 24 '13 02:09


2 Answers

A more elegant and scalable approach is to use a priority queue whose size is capped at N. With a min-heap priority queue, keep adding entries until the queue holds N of them. After that, compare each remaining entry with the head of the queue (the smallest value currently kept): if the new entry is larger, remove the head and add the new entry; otherwise skip it. Once every entry of the HashMap has been processed, the queue contains the top N entries. For M entries this takes O(M log N) time, compared with O(M log M) for a full sort.

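The advantage of this approach is that it needs only O(N) extra memory: even if the entire data set cannot fit in memory, we can break it into smaller blocks and feed each block through the same queue. With a thread-safe priority queue (for example java.util.concurrent.PriorityBlockingQueue) we can also add entries from different HashMaps concurrently, as long as the compare-and-replace step on the head is synchronized.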

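import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.PriorityQueue;

public static List<String> topNKeys(final HashMap<String, Double> map, int n) {
    if (n <= 0) return new ArrayList<String>(); // PriorityQueue rejects capacity < 1
    // Min-heap ordered by value: the head is the smallest of the current top n.
    PriorityQueue<String> topN = new PriorityQueue<String>(n, new Comparator<String>() {
        @Override
        public int compare(String s1, String s2) {
            return Double.compare(map.get(s1), map.get(s2));
        }
    });

    for (String key : map.keySet()) {
        if (topN.size() < n) {
            topN.add(key);
        } else if (map.get(topN.peek()) < map.get(key)) {
            topN.poll(); // key beats the current minimum, so it replaces the head
            topN.add(key);
        }
    }

    // toArray() gives no ordering guarantee, so drain the heap (ascending) and reverse.
    List<String> result = new ArrayList<String>();
    while (!topN.isEmpty()) result.add(topN.poll());
    Collections.reverse(result);
    return result;
}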
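The block-wise idea mentioned above could look roughly like the following. This is only a sketch under stated assumptions: the class name `ChunkedTopN` and the method `topNAcross` are made up for illustration, each key is assumed to appear in at most one block, and the blocks are passed in as a list rather than loaded one at a time. Entries are copied before they go into the heap so that a processed block can be discarded.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

public class ChunkedTopN {

    // Hypothetical helper: keeps the n largest entries seen across all chunks.
    // Assumes each key appears in at most one chunk.
    public static List<String> topNAcross(List<Map<String, Double>> chunks, int n) {
        List<String> result = new ArrayList<String>();
        if (n <= 0) {
            return result;
        }
        // Min-heap on value: the head is the weakest of the current top n.
        PriorityQueue<Map.Entry<String, Double>> heap =
                new PriorityQueue<Map.Entry<String, Double>>(n,
                        new Comparator<Map.Entry<String, Double>>() {
                            @Override
                            public int compare(Map.Entry<String, Double> a,
                                               Map.Entry<String, Double> b) {
                                return Double.compare(a.getValue(), b.getValue());
                            }
                        });
        for (Map<String, Double> chunk : chunks) {
            for (Map.Entry<String, Double> e : chunk.entrySet()) {
                if (heap.size() < n) {
                    // Copy the entry so the heap does not hold on to the chunk's internals.
                    heap.add(new AbstractMap.SimpleImmutableEntry<String, Double>(e));
                } else if (heap.peek().getValue() < e.getValue()) {
                    heap.poll();
                    heap.add(new AbstractMap.SimpleImmutableEntry<String, Double>(e));
                }
            }
        }
        // Drain in ascending order, then reverse to get largest-first.
        while (!heap.isEmpty()) {
            result.add(heap.poll().getKey());
        }
        Collections.reverse(result);
        return result;
    }
}
```

The heap never grows beyond n entries, so memory use depends on n rather than on the total number of entries across all blocks.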
Denny Abraham Cheriyan answered Oct 12 '22 23:10


What you've done is OK; you're going to have to write a custom Comparator somewhere, and where you've used it is fine.

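But you have a bug in your compare() method: you return 0 whenever the value for s1 is greater than or equal to the value for s2. You should return 0 only when the two values are equal, and a negative number when the value for s1 is greater (so that larger values sort first). The implementation below corrects that.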

A better (and simpler) implementation is:

 public int compare(String s1, String s2) {
     return Double.compare(map.get(s2), map.get(s1)); //reverse order
 }
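For completeness, here is a sketch of how that comparator might slot into the original sort-then-sublist approach. The class name `SortedTopN` is made up for illustration, and clamping n to the number of keys is an added assumption, since subList(0, N) throws when N exceeds the list size.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.Map;

public class SortedTopN {

    // Sorts all keys by value, largest first, and returns the first n of them.
    public static List<String> topNKeys(final Map<String, Double> map, int n) {
        List<String> keys = new ArrayList<String>(map.keySet());
        Collections.sort(keys, new Comparator<String>() {
            @Override
            public int compare(String s1, String s2) {
                return Double.compare(map.get(s2), map.get(s1)); // reverse order
            }
        });
        // Clamp n so subList never goes out of bounds (assumption, not in the original).
        int limit = Math.max(0, Math.min(n, keys.size()));
        // Copy the view so the result does not keep the full sorted list alive.
        return new ArrayList<String>(keys.subList(0, limit));
    }
}
```

With the sample data from the question (a=1, b=3, c=-1), topNKeys(map, 2) returns the keys b and a, in that order.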
Bohemian answered Oct 12 '22 22:10