I wonder how mongodb compare the "_id" field when doing query like the following:
db.data.find({"_id":{$gt:ObjectId("502aa46c0674d23e3cee6152")}}).sort({"_id":1}).limit(10);
Is it purely based on timestamp portion of the id?
To perform sort on ObjectId column, use sort().
To sort documents in MongoDB, you need to use sort() method. The method accepts a document containing a list of fields along with their sorting order. To specify sorting order 1 and -1 are used. 1 is used for ascending order while -1 is used for descending order.
MongoDB can perform sort operations on a single-field index in ascending or descending order. In compound indexes, the sort order determines whether the index can be sorted.
To expand slightly on what Andre said:
Since the ObjectID timestamp is only to the second, two (or more) ObjectIDs could easily be created with the same value for the timestamp (the first 4 bytes). If these were created on the same machine (machine ID - the next 3 bytes), by the same process (PID - the next 2 bytes), then the only thing to differentiate them would be the "inc" field, the last 3 bytes at the end.
Update: Jan 2020
This answer continues to be popular so it is worth updating a little. The ObjectID spec has evolved since this answer was written 8 years ago and the 5 bytes after the timestamp are now simply random, which will greatly decrease the likelihood of any collisions. The last three bytes are still incremental, but initialised at a random value to start, again making collisions less likely. The ObjectID now contains less context (you can't easily tell where it was generated and by what process) but I would guess that the information was not being used in any meaningful way and has been deprecated in favor of better randomisation of the ID.
End Update
See here for the full spec:
https://docs.mongodb.com/manual/reference/method/ObjectId/#ObjectIDs-BSONObjectIDSpecification
That "inc" field is either an ever incrementing field (then you can reasonably expect the sort to be in the insert/create order) or a random value (then likely unique, but not ordered), assuming the spec is implemented correctly of course. Note that the ObjectIDs may be generated by the driver, or the application (or indeed manually) rather than by MongoDB itself, so unless you have full control over how they are generated, then any or all of the above may apply.
In a way you are correct, if you sort by the _id
you will sort by the insertion time. This does not mean that the only comparison is done on the timestamp portion. ObjectID's are a BSON object type in their own right, they can be directly compared with each other. As they start with a timestamp, it follows logically that those in the past will be less than those in the future.
You can find more detail in the documentation
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With