I have a MongoDB collection with ~4M elements.
I want to grab X number of those elements, evenly spaced through the entire collection.
E.g., Get 1000 elements from the collection - one every 4000 rows.
Right now, I am getting the whole collection in a cursor and then only writing every Nth element. This gives me what I need but the original load of the huge collection takes a long time.
Is there an easy way to do this? Right now my guessed approach is to query on an incremented index property with a modulus. In the mongo shell, that would look like:
db.collection.find({i:{$mod:[10000,0]}})
But this seems like it will probably take just as much time for the query to run.
Jer
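Use the $sample aggregation stage (available since MongoDB 3.2).
It returns a random sample of the requested size. That approximates "every Nth document" statistically, but the picks are random, so even spacing is not guaranteed.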
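A minimal sketch of the $sample approach, assuming the collection is called collection as in the question. The helper name samplePipeline is made up for illustration; the pipeline it builds is just a single $sample stage.

```javascript
// Build an aggregation pipeline that asks MongoDB for `size` random documents.
// `samplePipeline` is a hypothetical helper name, not part of any MongoDB API.
function samplePipeline(size) {
  return [{ $sample: { size: size } }];
}

// Assumed usage in the mongo shell:
//   db.collection.aggregate(samplePipeline(1000))
```

The sampling happens on the server, so only the sampled documents are sent to the client instead of the whole collection.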
To receive exactly every Nth document in a result set, you would have to provide a sort order and iterate over the entire result set, discarding all unwanted documents in your application.
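The exact approach could look roughly like the sketch below. It assumes a cursor that exposes hasNext() and next(), as mongo shell cursors do; the helper names everyNth and strideFor are invented for this example, and sorting on _id is only one possible choice of order.

```javascript
// Keep every `step`-th document from a cursor-like object.
// `cursor` only needs hasNext() and next().
function everyNth(cursor, step) {
  const kept = [];
  let index = 0;
  while (cursor.hasNext()) {
    const doc = cursor.next();
    // Documents at positions 0, step, 2*step, ... are kept; the rest are discarded.
    if (index % step === 0) kept.push(doc);
    index += 1;
  }
  return kept;
}

// Hypothetical helper: stride needed to get about `wanted` documents out of `total`.
function strideFor(total, wanted) {
  return Math.max(1, Math.floor(total / wanted));
}

// Assumed usage in the mongo shell, sorted on _id for a stable order:
//   const step = strideFor(db.collection.countDocuments({}), 1000);
//   const docs = everyNth(db.collection.find().sort({ _id: 1 }), step);
```

This still reads every document through the cursor, so it trades speed for exact spacing.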