
Mongoid: how to merge results with map/reduce

I am trying to use the reduce output option, but I do not know how to read the output. Example:

@results = Article.collection.map_reduce(map, reduce, :out => 'test')

@results.find()
 => <Mongo::Cursor:0x2c276c4 namespace='myapp_development.test' @selector={} @cursor_id=> 

When I try :

@results1 = Article.collection.map_reduce(map, reduce, :out => { :reduce => 'test' })

I expected the results to be duplicated, but when I run @results1.find().to_a I get the same documents as in @results.

Also, how can I access the result collection by its namespace in Ruby on Rails using Mongoid?

Peter89 asked Feb 20 '23 05:02


1 Answer

TL;DR: You can't have a duplicate result.

Whenever the map function emits something, and whenever the reduce function returns something, it is a [key, value] pair. When the output is stored in a collection, each entry is represented as a document like this (shown here as a Ruby hash):

{
  "_id" => "my key",
  "value" => "my value"
}

To be clear on that point: the key is stored as the _id, so it is unique within a collection.

See the map/reduce output options to learn how MongoDB deals with duplicate keys when outputting into an existing collection:

  • replace (the default): the content of the existing collection is dropped, and the output goes into it.

  • merge: the existing collection is kept. When a result with the same key (the _id) already exists, it is replaced with the freshly map/reduced one.

  • reduce: the existing collection is kept. When a result with the same key (the _id) already exists, MongoDB takes the existing value and the freshly map/reduced one, runs the reduce function against the two of them, and stores the result.
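
As a rough illustration of the three modes above, here is a plain-Ruby sketch that models a collection as a Hash from _id to value. The helper apply_output and the sample data are hypothetical; nothing here calls MongoDB or Mongoid, it only mimics the behaviour described in the list.

```ruby
# Plain-Ruby model of the three output modes (an illustration only).
# A "collection" is a Hash mapping _id => value; no database is involved.
def apply_output(existing, fresh, mode, reducer = nil)
  case mode
  when :replace
    # The existing content is dropped; only the fresh output remains.
    fresh.dup
  when :merge
    # Existing entries are kept; fresh values win on identical keys.
    existing.merge(fresh)
  when :reduce
    # On identical keys, the reduce function combines the old and new values.
    existing.merge(fresh) { |key, old, new| reducer.call(key, [old, new]) }
  else
    raise ArgumentError, "unknown mode: #{mode}"
  end
end

# Hypothetical reducer: sums plain integer values.
sum = ->(_key, values) { values.sum }

existing = { "Dog" => 2, "Cat" => 1 }    # what is already in the collection
fresh    = { "Dog" => 3, "Puppy" => 1 }  # output of a new map/reduce run

replaced = apply_output(existing, fresh, :replace)
merged   = apply_output(existing, fresh, :merge)
reduced  = apply_output(existing, fresh, :reduce, sum)

p replaced
p merged
p reduced
```

In every mode each key appears exactly once in the result; the modes only differ in which value ends up stored under that key.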

So, you can't have a duplicate result.

Edit:

I respond here to "Can you show me how to apply output reduce, and how to call a collection result" (because the response is quite long):

There are many ways to do it. Let's take one example:

class Post
  include Mongoid::Document
  include Mongoid::Timestamps

  field :tags, :type => Array
end

Post.create(:tags => ["Dog", "Cat"])
Post.create(:tags => ["Dog", "Puppy"])

Let's map/reduce this :

map = %Q{
  function() {
    this.tags.forEach(function(tag){
      emit(tag, { count: 1 });
    }); 
  }
}

reduce = %Q{
  function(key, values) {
    var result = { count: 0 };
    values.forEach(function(value) {
      result.count += value.count;
    });
    return result;
  }
}

Post.map_reduce(map, reduce).out(replace: "tags")

OK, that puts the result in a collection named "tags", overwriting whatever it contained.

We can create a model to access it:

class Tag
  include Mongoid::Document

  field :value, :type => Hash
end

dog = Tag.find("Dog")
dog._id # => "Dog"
dog.value["count"] # => 2
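
To see where that count of 2 comes from, the map and reduce steps can be emulated in plain Ruby. This is only a sketch: the posts array stands in for the two Post documents created above, reduce_counts is a hypothetical Ruby port of the JavaScript reduce function, and no MongoDB call is made.

```ruby
# Stand-in for the two Post documents created earlier.
posts = [
  { "tags" => ["Dog", "Cat"] },
  { "tags" => ["Dog", "Puppy"] }
]

# map: emit one [tag, { "count" => 1 }] pair per tag of each post.
emitted = posts.flat_map do |post|
  post["tags"].map { |tag| [tag, { "count" => 1 }] }
end

# Group the emitted values by key before reducing.
grouped = emitted.group_by(&:first).transform_values { |pairs| pairs.map(&:last) }

# reduce: sum the counts of all values emitted for one key.
reduce_counts = lambda do |_key, values|
  { "count" => values.sum { |value| value["count"] } }
end

# Build the output documents in the { "_id" => key, "value" => ... } shape.
# (This sketch calls the reducer even for keys with a single value; the
# result is the same as passing that single value through unchanged.)
tags = grouped.map do |key, values|
  { "_id" => key, "value" => reduce_counts.call(key, values) }
end

dog = tags.find { |doc| doc["_id"] == "Dog" }
puts dog["value"]["count"]
```

"Dog" is emitted once per post, so its two { "count" => 1 } values are summed into a single document, while "Cat" and "Puppy" each keep a count of 1.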

For fun, let's say you keep the timestamp of the last time you ran the map/reduce. You can then use the reduce output option to update the results incrementally, processing only the posts created since then:

Post.where(:created_at.gt => Time.at(my_timestamp)).map_reduce(map, reduce).out(reduce: "tags")

** Edit: fixed map function **

Maxime Garcia answered Feb 22 '23 17:02