Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

In MongoDB mapreduce, how can I flatten the values object?

I'm trying to use MongoDB to analyse Apache log files. I've created a receipts collection from the Apache access logs. Here's an abridged summary of what my models look like:

db.receipts.findOne() {     "_id" : ObjectId("4e57908c7a044a30dc03a888"),     "path" : "/videos/1/show_invisibles.m4v",     "issued_at" : ISODate("2011-04-08T00:00:00Z"),     "status" : "200" } 

I've written a MapReduce function that groups all data by the issued_at date field. It summarizes the total number of requests, and provides a breakdown of the number of requests for each unique path. Here's an example of what the output looks like:

db.daily_hits_by_path.findOne() {     "_id" : ISODate("2011-04-08T00:00:00Z"),     "value" : {         "count" : 6,         "paths" : {             "/videos/1/show_invisibles.m4v" : {                 "count" : 2             },             "/videos/1/show_invisibles.ogv" : {                 "count" : 3             },             "/videos/6/buffers_listed_and_hidden.ogv" : {                 "count" : 1             }         }     } } 

How can I make the output look like this instead:

{     "_id" : ISODate("2011-04-08T00:00:00Z"),     "count" : 6,     "paths" : {         "/videos/1/show_invisibles.m4v" : {             "count" : 2         },         "/videos/1/show_invisibles.ogv" : {             "count" : 3         },         "/videos/6/buffers_listed_and_hidden.ogv" : {             "count" : 1         }     } } 
like image 567
nelstrom Avatar asked Aug 31 '11 13:08

nelstrom


People also ask

What is mapReduce function in MongoDB?

In MongoDB, map-reduce is a data processing programming model that helps to perform operations on large data sets and produce aggregated results. MongoDB provides the mapReduce() function to perform the map-reduce operations. This function has two main functions, i.e., map function and reduce function.

How mapReduce is implemented in MongoDB?

In this map-reduce operation, MongoDB applies the map phase to each input document (i.e. the documents in the collection that match the query condition). The map function emits key-value pairs. For those keys that have multiple values, MongoDB applies the reduce phase, which collects and condenses the aggregated data.


2 Answers

It's not currently possible, but I would suggest voting for this case: https://jira.mongodb.org/browse/SERVER-2517.

like image 155
mstearn Avatar answered Sep 22 '22 17:09

mstearn


Taking the best from previous answers and comments:

db.items.find().hint({_id: 1}).forEach(function(item) {     db.items.update({_id: item._id}, item.value); }); 

From http://docs.mongodb.org/manual/core/update/#replace-existing-document-with-new-document
"If the update argument contains only field and value pairs, the update() method replaces the existing document with the document in the update argument, except for the _id field."

So you need neither to $unset value, nor to list each field.

From https://docs.mongodb.com/manual/core/read-isolation-consistency-recency/#cursor-snapshot "MongoDB cursors can return the same document more than once in some situations. ... use a unique index on this field or these fields so that the query will return each document no more than once. Query with hint() to explicitly force the query to use that index."

like image 24
Denis Ryzhkov Avatar answered Sep 21 '22 17:09

Denis Ryzhkov