 

Sorting and limiting MongoDB queries by insertion order in a multi-process environment

Tags:

mongodb

I am using MongoDB in a multi-process environment and want to sort query results by insertion order, limited to documents inserted after a certain document. With a single process I can sort on the ObjectId, but two ObjectIds generated by different processes within the same second may compare in the wrong order.

Example:

ObjectId("5236dc5c 88ee6f 2075 bd0049")

might have been generated by process 2075 right before

ObjectId("5236dc5c 88ee6f 2071 f35fb8")

by process 2071. Note that the timestamp parts of both IDs are equal (5236dc5c). This timestamp is given in seconds.
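To make the problem concrete, here is a minimal Python sketch that splits the two example IDs into the four fields suggested by the spacing in the question (4-byte timestamp, 3-byte machine id, 2-byte process id, 3-byte counter) and sorts them. The helper name `split_object_id` and the `ObjectIdParts` type are made up for illustration, and the generation order is the one assumed in the question, not something the IDs themselves record.

```python
from typing import NamedTuple


class ObjectIdParts(NamedTuple):
    timestamp: int  # seconds since the Unix epoch (first 4 bytes)
    machine: str    # 3-byte machine identifier, kept as hex
    pid: str        # 2-byte process id field, kept as hex
    counter: int    # 3-byte counter


def split_object_id(hex_id: str) -> ObjectIdParts:
    """Split a 24-character hex ObjectId using the layout shown in the question."""
    hex_id = hex_id.replace(" ", "")
    if len(hex_id) != 24:
        raise ValueError("expected 24 hex characters")
    return ObjectIdParts(
        timestamp=int(hex_id[0:8], 16),
        machine=hex_id[8:14],
        pid=hex_id[14:18],
        counter=int(hex_id[18:24], 16),
    )


# Generation order assumed in the question: process 2075 first, then 2071.
generated = ["5236dc5c 88ee6f 2075 bd0049", "5236dc5c 88ee6f 2071 f35fb8"]

# ObjectIds compare byte by byte; for equal-length lowercase hex strings
# that is the same as plain string comparison.
by_id = sorted(generated, key=lambda s: s.replace(" ", ""))

first, second = (split_object_id(s) for s in generated)
same_second = first.timestamp == second.timestamp
order_preserved = by_id == generated
```

Here `same_second` is true and `order_preserved` is false: because the timestamps tie, the comparison falls through to the machine and process id bytes, so the ID from process 2071 sorts ahead of the one from process 2075.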

davidn asked Sep 16 '13 13:09



1 Answer

Sorting on ObjectIds or a date field may not give you the results you are looking for. ObjectIds and dates in inserted documents are generated client-side, so if you have connections from multiple machines you will run into ordering inconsistencies unless the clocks on those machines are perfectly synchronized.

Can you provide more details on what you are trying to do? There are a few different ways to get the behavior you want from MongoDB, depending on why you need a list of documents inserted after a specific document.

If, for instance, you are trying to use that ordered list of documents as a kind of queue, you could instead use a findAndModify command to fetch an unread document and atomically update a “read” field to guarantee you don’t read it twice. Each call to findAndModify would find the most recent document in the collection without a read field set to true, atomically set that field to true, and return your document to the client for processing.
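As a rough sketch of that queue pattern in Python: PyMongo exposes findAndModify-style updates as `find_one_and_update`. The function name `claim_next_unread` is hypothetical, the `read` field name follows the description above, and `collection` is assumed to be a PyMongo `Collection` (anything with a compatible `find_one_and_update` method works, which keeps the snippet free of a driver import).

```python
from typing import Any, Optional


def claim_next_unread(collection: Any) -> Optional[dict]:
    """Atomically mark one unread document as read and return it.

    `collection` is assumed to be a pymongo Collection (or an object with
    the same find_one_and_update method). Returns None when nothing is unread.
    """
    return collection.find_one_and_update(
        {"read": {"$ne": True}},   # match documents whose "read" field is not true
        {"$set": {"read": True}},  # set the flag as part of the same operation
        sort=[("_id", -1)],        # highest _id first, i.e. "most recent" by ObjectId
    )
```

With PyMongo's defaults the returned document is the version from before the update, so its `read` field will still be missing or false. Each worker process can call `claim_next_unread` in a loop; because the match and the update happen in one operation, two workers should not receive the same document. If you would rather process the oldest unread document first, flip the sort direction to `1`.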

On the other hand, if your use case really does require a list of documents in insertion order, you can exploit the natural ordering of documents. In MongoDB, documents are written to disk in order of insertion unless changes in document size or deletions require things to move around. A capped collection is guaranteed to maintain natural ordering, so you could feasibly get your list by reading such a collection in natural order. Please note that capped collections come with several major restrictions, which are spelled out in the documentation.
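A minimal sketch of the natural-order approach, again in Python: the function below walks a collection in `$natural` order and yields everything that comes after a given document. The name `documents_after` is hypothetical, `collection` is assumed to be a PyMongo `Collection` backed by a capped collection, and the scan is done client-side purely for illustration.

```python
from typing import Any, Iterator


def documents_after(collection: Any, last_seen_id: Any) -> Iterator[dict]:
    """Yield documents that follow `last_seen_id` in natural (insertion) order.

    `collection` is assumed to be a pymongo Collection on a capped collection.
    This scans from the start of the collection, so it is a sketch rather
    than something tuned for large collections.
    """
    seen = False
    for doc in collection.find().sort("$natural", 1):
        if seen:
            yield doc
        elif doc["_id"] == last_seen_id:
            # Everything after this point was inserted later.
            seen = True
```

With PyMongo, a capped collection can be created with something like `db.create_collection("events", capped=True, size=1048576)`, where the collection name and size are placeholders. Note that if the document with `last_seen_id` has already been overwritten by newer inserts, this function yields nothing, so callers may want to treat an empty result on a non-empty collection as a sign that they fell too far behind.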

3rf answered Sep 20 '22 14:09
