
Something like HashMap but sorted?

Tags:

java

hashmap

I'm writing a Java program that parses all the words from a text file and adds them to a HashMap. I need to count how many distinct words the file contains, and I also need to find the words with the highest counts. The HashMap maps each word to an integer representing how many times that word occurs.

Is there something like HashMap that will help me sort this?

Jenny asked Dec 14 '22 01:12



2 Answers

The manual way to do it is as follows:

  • Create a composite WordCount class with word and count fields.
  • Create a Comparator for that class that sorts by count.
  • When you're done filling your HashMap, create a new List of WordCount objects built from the entries (word and count) in the HashMap.
  • Sort the List using your comparator.
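
The steps above could be sketched roughly as follows, using only the JDK. The names WordCount, BY_COUNT_DESC and sortedByCount are illustrative, not part of any library, and the alphabetical tie-break is an assumption added so the output is deterministic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordCountSort {

    /** Composite of a word and the number of times it occurs (hypothetical helper class). */
    static final class WordCount {
        final String word;
        final int count;

        WordCount(String word, int count) {
            this.word = word;
            this.count = count;
        }

        @Override public String toString() {
            return word + "=" + count;
        }
    }

    /** Highest count first; ties are broken alphabetically (an assumption, for determinism). */
    static final Comparator<WordCount> BY_COUNT_DESC =
        Comparator.comparingInt((WordCount wc) -> wc.count)
                  .reversed()
                  .thenComparing(wc -> wc.word);

    /** Copies every map entry into a WordCount, then sorts the list by count. */
    static List<WordCount> sortedByCount(Map<String, Integer> counts) {
        List<WordCount> result = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            result.add(new WordCount(e.getKey(), e.getValue()));
        }
        result.sort(BY_COUNT_DESC);
        return result;
    }

    public static void main(String[] args) {
        // Fill the HashMap the way the question describes: word -> occurrences.
        Map<String, Integer> counts = new HashMap<>();
        for (String w : "the cat and the dog and the bird".split(" ")) {
            counts.merge(w, 1, Integer::sum);
        }
        System.out.println("distinct words: " + counts.size());
        System.out.println("by count: " + sortedByCount(counts));
    }
}
```

The number of distinct words is simply the size of the map, so only the ranking needs the extra list and comparator.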
Mark Bolusmjak answered Dec 23 '22 06:12



You could use a HashMultiset from google-collections:

import com.google.common.collect.*;
import com.google.common.collect.Multiset.Entry;

...

  final Multiset<String> words = HashMultiset.create();
  words.addAll(...);

  Ordering<Entry<String>> byIncreasingCount = new Ordering<Entry<String>>() {
    @Override public int compare(Entry<String> left, Entry<String> right) {
      // safe because count is never negative
      return left.getCount() - right.getCount();
    }
  };

  // max() returns the entry this ordering ranks highest, i.e. the largest count
  Entry<String> maxEntry = byIncreasingCount.max(words.entrySet());
  return maxEntry.getElement();

EDIT: oops, I thought you wanted only the single most common word. But it sounds like you want the several most common, so you could replace max with sortedCopy, and then you have a list of all the entries in order of increasing count (byIncreasingCount.reverse() would put the most common first).

To find the number of distinct words: words.elementSet().size()

Kevin Bourrillion answered Dec 23 '22 06:12
