MongoDB distinct too big 16mb cap

Question

I have a MongoDB collection. Put simply, it has two fields: user and url. It holds 39274590 documents. The key of this collection is {user, url}.

Using Java, I am trying to list the distinct urls:

  MongoDBManager db = new MongoDBManager( "Website", "UserLog" );
  return db.getDistinct("url");

But I receive an exception:

Exception in thread "main" com.mongodb.CommandResult$CommandFailure: command failed [distinct]: 
{ "serverUsed" : "localhost/127.0.0.1:27017" , "errmsg" : "exception: distinct too big, 16mb cap" , "code" : 10044 , "ok" : 0.0}

How can I solve this problem? Is there any plan B that can avoid this problem?

Will Shaver · Accepted Answer

In version 2.6 and later you can use the aggregate command with the $out stage to write the result to a separate collection: http://docs.mongodb.org/manual/reference/operator/aggregation/out/

This gets around MongoDB's 16MB limit on the size of a single result document, which is the cap that distinct runs into. You can read more about using the aggregation framework on large datasets in MongoDB 2.6 here: http://vladmihalcea.com/mongodb-2-6-is-out/

To do a 'distinct' query with the aggregation framework, group by the field.

db.UserLog.aggregate([{$group: {_id: '$url'} }]);

Note: I don't know how this works for the Java driver, good luck.
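
To make the effect of the $group stage above concrete, here is a minimal plain-Java sketch of what grouping on '$url' computes: one small output document per distinct url, instead of one big array holding every value. This is not driver code and never talks to MongoDB. The class GroupByUrlSketch and the helper doc are hypothetical names made up for illustration, and the insertion-ordered output is only for readability (the real stage does not promise any order). With the Java driver, the same pipeline would be passed to the collection's aggregate method.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class GroupByUrlSketch {
    // Hypothetical helper: builds a stand-in for one {user, url} document.
    static Map<String, Object> doc(String user, String url) {
        Map<String, Object> d = new LinkedHashMap<>();
        d.put("user", user);
        d.put("url", url);
        return d;
    }

    // Mirrors {$group: {_id: '$url'}}: emits one {_id: <url>} per distinct url.
    // Documents without a url end up in a single group whose _id is null.
    static List<Map<String, Object>> groupByUrl(List<Map<String, Object>> docs) {
        Set<Object> seen = new LinkedHashSet<>();
        for (Map<String, Object> d : docs) {
            seen.add(d.get("url"));
        }
        List<Map<String, Object>> groups = new ArrayList<>();
        for (Object url : seen) {
            Map<String, Object> g = new LinkedHashMap<>();
            g.put("_id", url);
            groups.add(g);
        }
        return groups;
    }

    public static void main(String[] args) {
        List<Map<String, Object>> docs = Arrays.asList(
            doc("alice", "http://example.com/a"),
            doc("bob", "http://example.com/a"),
            doc("alice", "http://example.com/b"));
        // Each group is printed on its own line, like documents read from a cursor.
        for (Map<String, Object> g : groupByUrl(docs)) {
            System.out.println(g);
        }
    }
}
```

The point of the sketch is the shape of the result: many small documents that can be read one by one or written to another collection, so no single document has to hold every distinct value.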

gmaniac · Answer

Take a look at this answer

1) The easiest way to do this is via the aggregation framework. This takes two "$group" stages: the first one groups by distinct values, the second one counts all of the distinct values.

2) If you want to do this with Map/Reduce you can. This is also a two-phase process: in the first phase we build a new collection with a list of every distinct value for the key. In the second we do a count() on the new collection.

Note that you cannot return the result of the map/reduce inline, because that will potentially overrun the 16MB document size limit. You can save the calculation in a collection and then count() the size of the collection, or you can get the number of results from the return value of mapReduce().
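
The two-phase idea described above can be sketched in plain Java, without the MongoDB driver, purely to show what each phase computes. Assumptions: the class DistinctCountSketch and its method names are hypothetical, a TreeMap stands in for the output collection, and each input row is a {user, url} pair. Phase one plays the role of map (emit the url as key) plus reduce (collapse all emits for one key), and phase two is the count() on the result.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DistinctCountSketch {
    // Phase one: stand-in for map/reduce writing to an output collection.
    // "map" emits (url, 1) for every row; "reduce" sums the 1s per url,
    // so the result holds exactly one entry per distinct url.
    static Map<String, Integer> buildDistinctCollection(List<String[]> userUrlPairs) {
        Map<String, Integer> out = new TreeMap<>();
        for (String[] pair : userUrlPairs) {
            String url = pair[1];
            out.merge(url, 1, Integer::sum);
        }
        return out;
    }

    // Phase two: stand-in for count() on the output collection.
    static int countDistinct(Map<String, Integer> distinctCollection) {
        return distinctCollection.size();
    }

    public static void main(String[] args) {
        List<String[]> rows = Arrays.asList(
            new String[] {"alice", "http://example.com/a"},
            new String[] {"bob", "http://example.com/a"},
            new String[] {"alice", "http://example.com/b"});
        Map<String, Integer> distinct = buildDistinctCollection(rows);
        System.out.println("distinct urls: " + countDistinct(distinct));
    }
}
```

Keeping the intermediate result as its own collection is what avoids the size cap: only the final count travels back to the caller, never the full list of values in one document.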

Tags:

java

mongodb

Munichong
