I have just started with MongoDB and mongoid. The biggest problem I'm having is understanding the map/reduce functionality to be able to do some very basic grouping and such.
Let's say I have a model like this:
class Person
  include Mongoid::Document
  field :age, type: Integer
  field :name
  field :sdate
end
That model would produce objects like these:
#<Person _id: 9xzy0, age: 22, name: "Lucas", sdate: "2013-10-07">
#<Person _id: 9xzy2, age: 32, name: "Paul", sdate: "2013-10-07">
#<Person _id: 9xzy3, age: 23, name: "Tom", sdate: "2013-10-08">
#<Person _id: 9xzy4, age: 11, name: "Joe", sdate: "2013-10-08">
Could someone show how to use mongoid map reduce to get a collection of those objects grouped by the sdate field? And to get the sum of ages of those that share the same sdate field?
I'm aware of this: http://mongoid.org/en/mongoid/docs/querying.html#map_reduce But somehow it would help to see that applied to a real example. Where does that code go, in the model I guess, is a scope needed, etc.
I can make a simple search with mongoid, get the array and manually construct anything I need, but I guess map reduce is the way to go here. And I imagine the js functions mentioned on the mongoid page are fed to the DB, which performs those operations internally. Coming from Active Record, these new concepts are a bit strange.
I'm on Rails 4.0, Ruby 1.9.3, Mongoid 4.0.0, MongoDB 2.4.6 on Heroku (mongolab) though I have locally 2.0 that I should update.
Thanks.
Map-reduce is a data processing model for condensing large volumes of data into useful aggregated results. To perform map-reduce operations, MongoDB provides the mapReduce database command (mapReduce() in the shell), which takes two main functions: a map function and a reduce function.
Map-reduce is a common pattern when working with Big Data – it's a way to extract info from a huge dataset. But now, starting with version 2.2, MongoDB includes a new feature called Aggregation framework. Functionality-wise, Aggregation is equivalent to map-reduce but, on paper, it promises to be much faster.
An aggregation pipeline provides better performance and usability than a map-reduce operation. Map-reduce operations can be rewritten using aggregation pipeline operators, such as $group , $merge , and others.
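To make the map and reduce functions concrete, here is a minimal plain-Ruby sketch of the same idea. It does not touch MongoDB at all; the helper names (`map_person`, `reduce_ages`, `sum_ages_by_sdate`) are made up for illustration, and the data simply mirrors the four sample people from the question.

```ruby
# Plain-Ruby imitation of map-reduce; no MongoDB involved.
# PEOPLE mirrors the four sample documents from the question.
PEOPLE = [
  { name: "Lucas", age: 22, sdate: "2013-10-07" },
  { name: "Paul",  age: 32, sdate: "2013-10-07" },
  { name: "Tom",   age: 23, sdate: "2013-10-08" },
  { name: "Joe",   age: 11, sdate: "2013-10-08" }
].freeze

# "map": turn one record into a [key, value] pair, like emit(key, value).
def map_person(person)
  [person[:sdate], person[:age]]
end

# "reduce": collapse all values emitted for one key into a single value.
def reduce_ages(_key, values)
  values.reduce(0, :+)
end

def sum_ages_by_sdate(people)
  emitted = people.map { |person| map_person(person) }
  # Group the emitted pairs by key, as the database does between the phases.
  grouped = emitted.group_by { |key, _value| key }
  grouped.each_with_object({}) do |(key, pairs), result|
    result[key] = reduce_ages(key, pairs.map { |_key, value| value })
  end
end

p sum_ages_by_sdate(PEOPLE)
```

For the sample data this gives a sum of 54 for 2013-10-07 and 34 for 2013-10-08.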
Here are the examples from http://mongoid.org/en/mongoid/docs/querying.html#map_reduce, adapted to your situation, with comments added to explain:
map = %Q{
  function() {
    // here "this" is the record that map
    // is going to be executed on
    emit(this.sdate, { age_sum: this.age, count: 1 });
  }
}
reduce = %Q{
  function(key, values) {
    // this will be executed for every group that has the same sdate value;
    // it must return the same shape that map emits, because MongoDB may
    // pass earlier reduce results back in as values
    var result = { age_sum: 0, count: 0 };
    values.forEach(function(value) {
      result.age_sum += value.age_sum; // sum of all ages
      result.count += value.count;     // total number of people
    });
    return result;
  }
}
# out(inline: 1) returns the results directly instead of writing them to a collection.
# Each result looks like { "_id" => sdate, "value" => { "age_sum" => ..., "count" => ... } }
results = Person.map_reduce(map, reduce).out(inline: 1)
first = results.first["value"]
first_average = first["age_sum"].to_f / first["count"] # finding the average
results.each do |result|
  # do whatever you want with result["_id"] and result["value"]
end
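One caveat worth illustrating: MongoDB may call the reduce function more than once for the same key, feeding the output of an earlier call back in as one of the values. A reduce that carries a running sum and a count survives this, while one that averages directly does not. The plain-Ruby sketch below shows the difference without a database; `merge_partials` and the three ages are made up for illustration.

```ruby
# Plain-Ruby illustration; nothing here talks to MongoDB.
# Each partial has the same shape a map function could emit.
def merge_partials(partials)
  partials.reduce({ age_sum: 0, count: 0 }) do |acc, partial|
    { age_sum: acc[:age_sum] + partial[:age_sum],
      count:   acc[:count] + partial[:count] }
  end
end

ages = [22, 32, 36]
emitted = ages.map { |age| { age_sum: age, count: 1 } }

# Reduce everything in one call...
all_at_once = merge_partials(emitted)
# ...or reduce the first two, then re-reduce that output with the third.
in_two_steps = merge_partials([merge_partials(emitted[0, 2]), emitted[2]])

# A naive "average of averages" over the same two steps.
naive = ((ages[0] + ages[1]) / 2.0 + ages[2]) / 2.0

puts all_at_once == in_two_steps                      # true
puts all_at_once[:age_sum] / all_at_once[:count].to_f # 30.0
puts naive                                            # 31.5
```

The sum-and-count form gives 30.0 either way, whereas the naive version drifts to 31.5 as soon as the values are combined in two steps.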
That said, I would suggest you use aggregation rather than map-reduce for such a simple operation. The way to do this is as follows:
results = Person.collection.aggregate([{"$group" => { "_id" => {"sdate" => "$sdate"},
                                                      "avg_of_ages" => {"$avg" => "$age"}}}])
The result will be almost identical to the map-reduce one, and you will have written a lot less code.
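To get a feel for what the $group stage computes, here is a plain-Ruby stand-in run over the four sample people from the question. It is only a sketch: `group_avg_age` is a hypothetical helper, no database is involved, and the hashes merely imitate the shape of the documents the pipeline returns (an "_id" holding the grouping key plus the computed field).

```ruby
# Plain-Ruby stand-in for the $group stage; no MongoDB involved.
people = [
  { "name" => "Lucas", "age" => 22, "sdate" => "2013-10-07" },
  { "name" => "Paul",  "age" => 32, "sdate" => "2013-10-07" },
  { "name" => "Tom",   "age" => 23, "sdate" => "2013-10-08" },
  { "name" => "Joe",   "age" => 11, "sdate" => "2013-10-08" }
]

# Group documents by sdate and average the ages in each group.
def group_avg_age(docs)
  docs.group_by { |doc| doc["sdate"] }.map do |sdate, group|
    ages = group.map { |doc| doc["age"] }
    { "_id" => { "sdate" => sdate },
      "avg_of_ages" => ages.reduce(0, :+) / ages.size.to_f }
  end
end

results = group_avg_age(people)
results.each do |doc|
  puts "#{doc["_id"]["sdate"]}: #{doc["avg_of_ages"]}"
end
```

For the sample data the averages are 27.0 for 2013-10-07 and 17.0 for 2013-10-08.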