Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Does MongoDB find() query return documents sorted by creation time?

I need documents sorted by creation time (from oldest to newest).

Since ObjectID saves timestamp by default, we can use it to get documents sorted by creation time with CollectionName.find().sort({_id: 1}).

Also, I noticed that regular CollectionName.find() query always returns the documents in same order as CollectionName.find().sort({_id: 1}).

My question is:

Is CollectionName.find() guaranteed to return documents in same order as CollectionName.find().sort({_id: 1}) so I could leave sorting out?

like image 295
Drag0 Avatar asked Dec 25 '22 18:12

Drag0


2 Answers

No. Well, not exactly. A db.collection.find() will give you the documents in the order they appear in the data files most of the times, though this isn't guaranteed.

Result Ordering

Unless you specify the sort() method or use the $near operator, MongoDB does not guarantee the order of query results.

As long as your data files are relatively new and few updates happen, the documents might (and most of the times will) be returned in what appears to be sorted by _id since ObjectId is monotonically increasing.

Later in the lifecycle, old documents may have been moved from their old position (because they increased in size and documents are never partitioned) and new ones are written in the place formerly occupied by another document. In this case, a newer document may be returned in a position between two old documents.

There is nothing wrong with sorting documents by _id, since the index will be used for that, adding only some latency for document retrieval.

However, I would strongly recommend against using the ObjectId for date operations for several reasons:

  1. ObjectIds can not be used for date comparison queries. So you couldn't query for all documents created between date x and date y. To archive that, you'd have to load all documents, extract the date from the ObjectId and compare it – which is extremely inefficient.
  2. If the creation date matters, it should be explicitly addressable in the documents
  3. I see ObjectIds as a choice of last resort for the _id field and tend to use other values (compound on occasions) as _ids, since the field is indexed by default and it is very likely that one can save precious RAM by using a more meaningful value as id.

You could use the following for example which utilizes DBRefs

{
  _id: {
    creationDate: new ISODate(),
    user: { 
      "$ref" : "creators",
      "$id" : "mwmahlberg",
      "$db" : "users"
    }
  }
}

And do a quite cheap sort by using

db.collection.find().sort({_id.creationDate:1})
like image 180
Markus W Mahlberg Avatar answered Dec 28 '22 19:12

Markus W Mahlberg


Is CollectionName.find() guaranteed to return documents in same order as CollectionName.find().sort({_id: 1})

No, it's not! If you didn't specify any order, then a so-called "natural" ordering is used. Meaning that documents will be returned in the order in which they physically appear in data files.

Now, if you only insert documents and never modify them, this natural order will coincide with ascending _id order. Imagine, however, that you update a document in such a way that it grows in size and has to be moved to a free slot inside of a data file (usually this means somewhere at the end of the file). If you were to query documents now, they wouldn't follow any sensible (to an external observer) order.

So, if you care about order, make it explicit.

Source: http://docs.mongodb.org/manual/reference/glossary/#term-natural-order

natural order

The order in which the database refers to documents on disk. This is the default sort order. See $natural and Return in Natural Order.

Testing script (for the confused)

> db.foo.insert({name: 'Joe'})
WriteResult({ "nInserted" : 1 })

> db.foo.insert({name: 'Bob'})
WriteResult({ "nInserted" : 1 })

> db.foo.find()
{ "_id" : ObjectId("55814b944e019172b7d358a0"), "name" : "Joe" }
{ "_id" : ObjectId("55814ba44e019172b7d358a1"), "name" : "Bob" }

> db.foo.update({_id: ObjectId("55814b944e019172b7d358a0")}, {$set: {answer: "On a sharded collection the $natural operator returns a collection scan sorted in natural order, the order the database inserts and stores documents on disk. Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: <value> }, then the query with the sort by $natural order can use the _id index. You cannot specify $natural sort order if the query includes a $text expression."}})
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })

> db.foo.find()
{ "_id" : ObjectId("55814ba44e019172b7d358a1"), "name" : "Bob" }
{ "_id" : ObjectId("55814b944e019172b7d358a0"), "name" : "Joe", "answer" : "On a sharded collection the $natural operator returns a collection scan sorted in natural order, the order the database inserts and stores documents on disk. Queries that include a sort by $natural order do not use indexes to fulfill the query predicate with the following exception: If the query predicate is an equality condition on the _id field { _id: <value> }, then the query with the sort by $natural order can use the _id index. You cannot specify $natural sort order if the query includes a $text expression." }
like image 29
Sergio Tulentsev Avatar answered Dec 28 '22 19:12

Sergio Tulentsev