Groovy List : Group By element's count and find highest frequency elements

I have a Groovy list as below:

def certs = ['0xc1','0xc1','0xc1','0xc1','0xc2','0xc2','0xc3','0xc4','0xc4','0xc5','0xc5','0xc5','0xc5']

I'm trying to find the number of occurrences of each element and group by that count. I've tried:

certs.groupBy { it }.findAll { it.value.size() }

but I'm getting the output below:

[0xc1:[0xc1, 0xc1, 0xc1, 0xc1], 0xc2:[0xc2, 0xc2], 0xc3:[0xc3], 0xc4:[0xc4, 0xc4], 0xc5:[0xc5, 0xc5, 0xc5, 0xc5]]

Instead, I'm expecting this:

[0xc1:4, 0xc2:2, 0xc3:1, 0xc4:2, 0xc5:4]

Can someone help me with this? I also want to find the most frequently occurring elements in the list; in my case these are 0xc1 and 0xc5.

UPDATE:

def myMap = certs.inject([:]) { m, x -> if (!m[x]) m[x] = 0; m[x] += 1; m }
def maxValue = myMap.values().max{it} 
def myKeys = []
myMap.findAll{ it.value == maxValue }.each{myKeys << it?.key}
println myKeys  // result = [0xc1, 0xc5]
//println myMap.sort { a, b -> b.value <=> a.value }
RanPaul asked Jun 05 '15 19:06


2 Answers

Map counts = certs.countBy { it }
counts.findAll { it.value == counts.values().max() }

or as a one-liner:

certs.countBy { it }.groupBy { it.value }.max { it.key }.value.keySet()
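For readers who find the one-liner dense, here is a rough step-by-step sketch of the same chain. The intermediate variable names (counts, byCount, top, mostFrequent) are made up for illustration and are not part of the original answer.

```groovy
def certs = ['0xc1','0xc1','0xc1','0xc1','0xc2','0xc2','0xc3','0xc4','0xc4','0xc5','0xc5','0xc5','0xc5']

// Step 1: map each distinct element to its number of occurrences
def counts = certs.countBy { it }

// Step 2: regroup those entries by their count, giving count -> sub-map
def byCount = counts.groupBy { it.value }

// Step 3: pick the entry whose key (the count) is largest
def top = byCount.max { it.key }

// Step 4: the keys of that sub-map are the most frequent elements
def mostFrequent = top.value.keySet()
println mostFrequent
```

Each step only uses methods that already appear in the one-liner, so the two forms should behave the same.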
dmahapatro answered Oct 09 '22 21:10


There are several ways to do this. A good place to start learning Groovy's methods on collections is with collect and inject.

The method collect generates a new collection from the old one, taking a closure that describes how to transform each element of the existing collection into an element of the new collection.
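As a small illustration of collect (not from the original answer), this sketch turns each hex string into its integer value. The shortened list and the use of Integer.decode are choices made for this example only.

```groovy
def certs = ['0xc1', '0xc2', '0xc3']

// collect applies the closure to every element and returns a new list;
// Integer.decode understands the "0x" prefix
def values = certs.collect { Integer.decode(it) }

println values
```

The original list is left unchanged; collect always builds a new one.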

The method inject generates a new object given a collection. It takes a closure that takes two arguments, one for a running total object and one for a member of the current collection, where the body of the closure shows how to modify the running total for the passed-in member of the collection. A common example is summing up a list of numbers (although there's a convenience method, sum, for this case).
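The summing case mentioned above might look like the following sketch; the list of numbers and the parameter names acc and n are placeholders chosen for this example.

```groovy
def numbers = [1, 2, 3, 4]

// inject starts from the initial value 0 and feeds the closure's result
// back in as the running total (acc) for the next element (n)
def total = numbers.inject(0) { acc, n -> acc + n }

// the convenience method gives the same result
def viaSum = numbers.sum()

println total
```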

So you could get the map of counts using inject:

m = certs.inject([:]) { m, x -> if (!m[x]) m[x] = 0; m[x] += 1; m }

This executes the closure for each element of the certs list, incrementing the value for that element's key in the new map, resulting in

[0xc1:4, 0xc2:2, 0xc3:1, 0xc4:2, 0xc5:4]

This is pretty ugly, though. The closure code is not straightforward, and I have to return the map from the closure so that it becomes the running total for the next element.

Starting with groupBy generates a map; it's just not exactly the map you want. There's a method like collect, but specialized for producing maps, called collectEntries, that lets you transform the elements of a collection or map and generate a new map from them:

certs.groupBy().collectEntries { [(it.key) : it.value.size()] }

But neither of these is necessary since Groovy 1.8, which added a countBy method that does this much more cleanly; see the other answer for a better way.

Once you have the map generated, finding the entries with the greatest value could be done with

maxSize = m.values().max()
m.entrySet().findAll { it.value == maxSize }
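Putting the pieces of this answer together, an end-to-end sketch could look like this. It assumes the inject-based count map from earlier; the accumulator name acc and the variable winners are introduced here for illustration.

```groovy
def certs = ['0xc1','0xc1','0xc1','0xc1','0xc2','0xc2','0xc3','0xc4','0xc4','0xc5','0xc5','0xc5','0xc5']

// build the map of counts; the closure must return the map so it
// becomes the running total for the next element
def m = certs.inject([:]) { acc, x -> acc[x] = (acc[x] ?: 0) + 1; acc }

// largest count, then every entry that reaches it
def maxSize = m.values().max()
def winners = m.findAll { it.value == maxSize }

println winners
```

Calling findAll directly on the map returns a map, whereas m.entrySet().findAll returns a collection of entries; either can be used depending on what shape the caller wants.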
Nathan Hughes answered Oct 09 '22 22:10