Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Select Element with Max Occurrence in Multiset

I may want to rephrase the question to "How do I select first item in a Multiset?" because it seems Multiset is already ordered according to frequencies.

I have a Multiset myList = Multiset.create();

[maa00 mfnt11 malignlft mbold mlt18 mfl x 3, caa00 cfnt11 calignlft cbold clt17 cfl]

I could not find any method like myList.getIndex(0). Please note, in the end, I need the count of element that has maximum frequency.

Is there any one liner for this ? Or do I have to do that iteration?

Update : I am getting maximum frequency using :

myList.count(Multisets.copyHighestCountFirst(myList).asList().get(0)));

But this is too slow. Can you please suggest, what exactly should I use?

Update 1: Using above copyHighestCountFirst method is proving too slow. In one instance of loop, it is taking 80+milliseconds opposed to average 40 milliseconds using without it. In large loops, should I prefer simple iteration?

Update 2 : Got it working using :

myList.count(myList.entrySet().iterator().next().getElement())

Without almost zero impact on performance. I still wonder if there is any better way to do it.

Sidenote : In Python I did the same with :

j = defaultdict(int)
for k in clList:
    j[k] +=1
result1 = max(j.iteritems(), key=lambda x:x[1]) //count of frequency of item with max count
like image 716
akshayb Avatar asked May 23 '13 14:05

akshayb


2 Answers

There's been a lot of alternatives thrown around between your question and the other answer posted, but many of them seem to depend on the idea that .get(0) or .iterator().next() is going to get you the most frequent element. It will not!

Your only two decent choices are Multisets.copyHighestCountFirst(bag).elementSet().iterator().next(), which is wasteful as you say, or loop through the entrySet manually and check each to see if it's the most frequent so far.

You should file a Guava feature request to extract the most frequent element. I can't promise what will happen to it, but it's worth requesting.

like image 153
Kevin Bourrillion Avatar answered Sep 18 '22 14:09

Kevin Bourrillion


I was running into a similar challenge today in trying to find a simple, reasonably efficient way to find the element in a Multiset with the maximum count. In the future we live in with Java 8, I was able to modify Louis Wasserman's solution into a clean one liner:

multiset.entrySet().stream().max(Ordering.natural().onResultOf(Multiset.Entry::getCount)).get();

This will give you the Multiset.Entry with the maximum count (assuming multiset isn't empty) allowing you to access either the element or its count.

like image 34
Spencer Kelley Avatar answered Sep 20 '22 14:09

Spencer Kelley