
MongoDB map reduce with query

I have a rather large MongoDB database that I need to extract statistics from, and I do this by running a Map/Reduce query.

The problem now is that I need to narrow the query, for example to documents with status: 'drafted', instead of running it over the whole collection.

This is my Map/Reduce code (I am using CodeIgniter). I tried to follow the last step in this recipe, but I cannot get any results, so I think I have the syntax wrong: http://cookbook.mongodb.org/patterns/unique_items_map_reduce/

$map = new MongoCode ("function() {
    day = Date.UTC(this.created_at.getFullYear(), this.created_at.getMonth(), this.created_at.getDate());
    emit ({day: day, _id: this._id}, {created_at: this.created_at, count: 1});
}");

$reduce = new MongoCode ("function( key , values ) {
    var count = 0;
    values.forEach (function(v) {
        count += v['count'];
    });
    return {count: count};
}");

$outer = $this->cimongo->command (array (
    "mapreduce" => "documents",
    "map"       => $map,
    "reduce"    => $reduce,
    "out"       => "stats_results"
));

$map = new MongoCode ("function() {
    emit(this['_id']['day'], {count: 1});
}");

$reduce = new MongoCode ("function( key , values ) {
    var count = 0;
    values.forEach (function(v) {
        count += v['count'];
    });
    return {count: count};
}");

$outer = $this->cimongo->command (array (
    "mapreduce" => "stats_results",
    "map"       => $map,
    "reduce"    => $reduce,
    "out"       => "stats_results_unique"
));
Jonathan Clark asked Jun 10 '12 13:06



1 Answer

A few things about your question:

1) The example in the cookbook might be a bit too complex for what you're trying to do. Here is a simpler one:

Given a document structure that looks like this:

{
    "url" : "http://example.com/page8580.html",
    "user_id" : "Jonathan.Clark",
    "date" : ISODate("2012-06-11T10:59:36.271Z")
}

Here is some sample JavaScript code to run a map/reduce job that will count the number of visits per distinct URL.

// Map function:

map = function() {
  emit({ url: this.url }, {count: 1});
}

// Reduce function:

reduce = function(key, values) {
    var count = 0;

    values.forEach(
    function(val) { count += val['count']; }
    );

    return {count: count};
};

// Run the Map/Reduce function across the 'pageviews' collection:
// Note that MongoDB will store the results in the 'pages_per_day'
//   collection because the 'out' parameter is present

 db.pageviews.mapReduce( 
    map,        // pass in the 'map' function as an argument
    reduce,     // pass in the 'reduce' function as an argument
    // options
    { out: 'pages_per_day',     // output collection
      verbose: true }       // report extra statistics
);
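To see what these two functions compute without a running server, here is a minimal plain-JavaScript sketch of the emit/group/reduce flow. The helper `simulateMapReduce` and the sample documents are made up for illustration and are not part of MongoDB; `emit` is passed into `map` as an argument here, whereas in MongoDB it is available as a global.

```javascript
// Hypothetical helper (not a MongoDB API): runs map over every document,
// groups the emitted values by key, then reduces each group.
function simulateMapReduce(docs, map, reduce) {
  const groups = new Map(); // serialized key -> { key, values }

  const emit = (key, value) => {
    const id = JSON.stringify(key);
    if (!groups.has(id)) groups.set(id, { key: key, values: [] });
    groups.get(id).values.push(value);
  };

  // As in MongoDB, `this` inside map() is the current document.
  for (const doc of docs) map.call(doc, emit);

  // MongoDB does not call reduce for a key with a single value,
  // so that value is passed through unchanged.
  return Array.from(groups.values()).map((g) => ({
    _id: g.key,
    value: g.values.length === 1 ? g.values[0] : reduce(g.key, g.values),
  }));
}

// Sample data (assumed), shaped like the 'pageviews' documents above.
const pageviews = [
  { url: "http://example.com/a.html", user_id: "Jonathan.Clark" },
  { url: "http://example.com/a.html", user_id: "William.Z" },
  { url: "http://example.com/b.html", user_id: "Jonathan.Clark" },
];

function map(emit) {
  emit({ url: this.url }, { count: 1 });
}

function reduce(key, values) {
  let count = 0;
  values.forEach((val) => { count += val.count; });
  return { count: count };
}

const results = simulateMapReduce(pageviews, map, reduce);
console.log(JSON.stringify(results));
// a.html ends up with count 2, b.html with count 1
```

Each element of `results` has the same `_id` / `value` shape that MongoDB writes to the output collection.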

2) If you want to run the Map/Reduce function on only a subset of the 'pageviews' collection, you can pass a 'query' option in the call to 'mapReduce()' to restrict the documents that the 'map()' function will operate on:

// Run the Map/Reduce function across the 'pageviews' collection, but 
// only report on the pages seen by "Jonathan.Clark"

 db.pageviews.mapReduce( 
    map,        // Use the same map & reduce functions as before
    reduce,     
    { out: 'pages_per_day_1user',       // output to different collection
      query:{ 'user_id': "Jonathan.Clark" },    // query descriptor
      verbose: true }       
);

Note that if you aren't using JavaScript, you'll have to translate these calls into whatever programming language you're using.
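Applied to the case in the question, the option would presumably be query: { status: 'drafted' }. The following plain-JavaScript sketch only illustrates the order of operations (filter first, then count per day); the helper `matchesQuery` and the sample documents are assumptions for illustration, and real MongoDB queries support far more than simple equality.

```javascript
// Hypothetical helper: true when every field in `query` equals the
// same field in `doc` (simple equality only).
function matchesQuery(doc, query) {
  return Object.keys(query).every((field) => doc[field] === query[field]);
}

// Sample data (assumed), using the field names from the question.
const documents = [
  { _id: 1, status: "drafted",   created_at: new Date(Date.UTC(2012, 5, 9, 8)) },
  { _id: 2, status: "published", created_at: new Date(Date.UTC(2012, 5, 9, 9)) },
  { _id: 3, status: "drafted",   created_at: new Date(Date.UTC(2012, 5, 10, 7)) },
  { _id: 4, status: "drafted",   created_at: new Date(Date.UTC(2012, 5, 10, 18)) },
];

const query = { status: "drafted" };

// Step 1: the query runs first, so map() never sees non-matching documents.
const selected = documents.filter((doc) => matchesQuery(doc, query));

// Step 2: map and reduce folded together: count selected documents per UTC day.
const countsPerDay = {};
for (const doc of selected) {
  const day = doc.created_at.toISOString().slice(0, 10); // e.g. "2012-06-09"
  countsPerDay[day] = (countsPerDay[day] || 0) + 1;
}

console.log(countsPerDay);
// 2012-06-09 has 1 drafted document, 2012-06-10 has 2
```

The published document is dropped before any counting happens, which is the effect the 'query' option has on the input to 'map()'.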

3) Here's an example of calling the Map/Reduce function with a query condition using PHP:

$outer = $this->cimongo->command (array (
                "mapreduce" => "pageviews",   
                "map"       => $map,   
                "reduce"    => $reduce,   
                "out"       => "pages_per_day_1user",
                "query"     => array( "user_id" => "Jonathan.Clark" )
            ));

4) For more information about Map/Reduce, see the following references:

  • Map/Reduce Manual page: http://www.mongodb.org/display/DOCS/MapReduce
  • Debugging Map/Reduce: http://www.mongodb.org/display/DOCS/Troubleshooting+MapReduce
  • Using Map/Reduce in PHP: http://us.php.net/manual/en/mongodb.command.php

    --William

William Z answered Sep 30 '22 14:09
