
How to group and sum arrays in Ruby?

Tags: arrays, ruby

I have an array of arrays like this:

ar = [[5, "2014-01-27"],
[20, "2014-01-28"],
[5, "2014-01-28"],
[10, "2014-01-28"],
[15, "2014-01-29"],
[5, "2014-01-29"],
[5, "2014-01-30"],
[10, "2014-01-30"],
[5, "2014-01-30"]]

What I ultimately need to do is group the array items by date and sum up the numbers in the first item of each sub-array.

So output would be something like:

[[5, "2014-01-27"],
[35, "2014-01-28"],
[20, "2014-01-29"],
[20, "2014-01-30"]]
Shpigford asked Feb 02 '14 04:02



People also ask

How do you sum an array in Ruby?

Ruby has a method called inject. It walks each element of an enumerable and accumulates a result as it goes. When called without an initial value, inject uses the first element as the starting sum, then adds each remaining element to it in turn.
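As a rough illustration of the behaviour described above, here is a small sketch. The array and variable names are made up for the example.

```ruby
# Sample numbers, chosen only for illustration.
numbers = [5, 20, 5, 10]

# Without an initial value, inject seeds the accumulator with the first element.
total = numbers.inject { |sum, n| sum + n }

# Passing the operator as a symbol is equivalent shorthand.
total_short = numbers.inject(:+)

# With an explicit initial value, an empty array yields that value instead of nil.
empty_total = [].inject(0) { |sum, n| sum + n }

puts total
puts total_short
puts empty_total
```

Passing an explicit 0 is the safer habit when the array might be empty.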

What does group_by do in Ruby?

group_by is a built-in Enumerable method that returns a hash. Each key is a value returned by the block, and each value is an array of the elements for which the block returned that key. If no block is given, an enumerator is returned instead.
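A minimal sketch of group_by on a shortened version of the question's data. The variable names are only for illustration.

```ruby
# A shortened version of the [number, date] pairs from the question.
pairs = [[5, "2014-01-27"], [20, "2014-01-28"], [5, "2014-01-28"]]

# The block's return value (the date) becomes the hash key.
grouped = pairs.group_by { |_number, date| date }

# &:last is shorthand for a block that calls .last on each element.
grouped_short = pairs.group_by(&:last)

# With no block, group_by returns an Enumerator.
no_block = pairs.group_by

p grouped
p no_block.class
```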

What is .first in Ruby?

first is an Array method that returns the first element of the array, or, when given a count n, an array of the first n elements.
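A quick sketch of both forms of first, using made-up values:

```ruby
# One [number, date] pair like those in the question.
pair = [5, "2014-01-27"]

# No argument: returns the first element itself.
first_item = pair.first

# With a count: returns an array of that many leading elements.
first_two = [5, 20, 10].first(2)

# On an empty array, first returns nil.
empty_first = [].first

p first_item
p first_two
p empty_first
```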


2 Answers

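# Group the pairs by date: {date => [[number, date], ...]}
h = ar.group_by(&:last)
# Replace each group with the sum of its numbers: {date => sum}
h.keys.each{|k| h[k] = h[k].map(&:first).inject(:+)}
# Turn each [date, sum] entry into [sum, date]
h.map(&:reverse)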
sawa answered Oct 02 '22 16:10



ar.group_by(&:last).map{ |x, y| [y.inject(0){ |sum, i| sum + i.first }, x] }

Edit to add explanation:
We group by the last value (the date) yielding a hash:

{"2014-01-27"=>[[5, "2014-01-27"]], "2014-01-28"=>[[20, "2014-01-28"], [5, "2014-01-28"], [10, "2014-01-28"]], "2014-01-29"=>[[15, "2014-01-29"], [5, "2014-01-29"]], "2014-01-30"=>[[5, "2014-01-30"], [10, "2014-01-30"], [5, "2014-01-30"]]}

Then we map over that hash, with x as the hash key (the date) and y as the array of [number, date] pairs in that group.

.inject(0) means sum starts out as 0, then we add the first item of each array (the number) to that sum until all arrays are iterated and all the numbers are added.
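To make the accumulation concrete, here is a sketch that traces inject(0) over the "2014-01-28" group. The steps array exists only to record the trace and is not part of the answer's code.

```ruby
# The group for one date, as produced by group_by(&:last).
group = [[20, "2014-01-28"], [5, "2014-01-28"], [10, "2014-01-28"]]

# Records [sum_before, number_added, sum_after] for each iteration.
steps = []

total = group.inject(0) do |sum, pair|
  new_sum = sum + pair.first
  steps << [sum, pair.first, new_sum]
  new_sum # the block's return value becomes sum on the next iteration
end

steps.each { |before, added, after| puts "#{before} + #{added} = #{after}" }
puts total
```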

Each block call then returns [sum, x], where sum is the total of the numbers in the group and x is the hash key (the date).

This avoids a separate map(&:first) pass over each group, because inject reads the number directly from each pair, and it avoids a final reverse, because the block already emits each pair as [sum, date].
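For comparison, here is a variant that skips group_by and accumulates the totals in a single pass. This is not from either answer, just a sketch using each_with_object and a hash that defaults to 0. It relies on Ruby hashes preserving insertion order (Ruby 1.9 and later).

```ruby
ar = [[5, "2014-01-27"],
      [20, "2014-01-28"],
      [5, "2014-01-28"],
      [10, "2014-01-28"],
      [15, "2014-01-29"],
      [5, "2014-01-29"],
      [5, "2014-01-30"],
      [10, "2014-01-30"],
      [5, "2014-01-30"]]

# Hash.new(0) returns 0 for missing keys, so += works on first sight of a date.
totals = ar.each_with_object(Hash.new(0)) do |(number, date), acc|
  acc[date] += number
end

# Emit [sum, date] pairs in the order the dates first appeared.
result = totals.map { |date, total| [total, date] }

p result
```

No timing claim is made for this variant; it would need to be measured like the others.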

Edit: Interestingly, the benchmarks for @bjhaid's answer and mine are close:

    user     system      total        real
5.117000   0.000000   5.117000 (  5.110292)
5.632000   0.000000   5.632000 (  5.644323)

1,000,000 iterations; my method was the slower of the two.
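The script behind those numbers isn't shown, and @bjhaid's answer isn't included on this page. As a rough sketch of how a comparison like this could be run, the following uses Ruby's standard Benchmark module on the two approaches that are shown here. The labels and the iteration count n are assumptions, and timings will vary by machine and Ruby version.

```ruby
require "benchmark"

ar = [[5, "2014-01-27"],
      [20, "2014-01-28"],
      [5, "2014-01-28"],
      [10, "2014-01-28"],
      [15, "2014-01-29"],
      [5, "2014-01-29"],
      [5, "2014-01-30"],
      [10, "2014-01-30"],
      [5, "2014-01-30"]]

# Hypothetical label: group, map to numbers, sum, then reverse each entry.
group_then_sum = lambda do
  h = ar.group_by(&:last)
  h.keys.each { |k| h[k] = h[k].map(&:first).inject(:+) }
  h.map(&:reverse)
end

# Hypothetical label: group, then sum with inject(0) while mapping.
group_and_inject = lambda do
  ar.group_by(&:last).map { |x, y| [y.inject(0) { |sum, i| sum + i.first }, x] }
end

# Assumed iteration count, kept small here; raise it for steadier timings.
n = 10_000

Benchmark.bm(18) do |bm|
  bm.report("group_then_sum") { n.times { group_then_sum.call } }
  bm.report("group_and_inject") { n.times { group_and_inject.call } }
end
```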

user21033168 answered Oct 02 '22 14:10
