
Node.JS + mongo: .find().each() stopping after first batch

This has me stumped.

I have a standalone (command-line executed) node script, whose purpose is to iterate through all the documents in a large collection (several hundred thousand of them), and for each document, perform a few calculations, run a little additional JS code, and then update the document with some new values.

Per the documentation for cursor.each(), once I've got my cursor from collection.find(), the .each(cb) method should execute cb(item) on each item in the entire collection.

Example code:

myDb.collection('bigcollection').find().each(function(err, doc) {
    if (err) {
        console.log("Error: " + err);
    } else {
        if (doc != null) {
            process.stdout.write(".");
        } else {
            process.stdout.write("X");
        }
    }
});

What I'd expect this to do is print out several hundred thousand .'s and then print an X at the end, as cursor.each() is supposed to "Iterate over all the documents for this cursor," and per the example code, "If the item is null then the cursor is exhausted/empty and closed."

But what it actually does is print out precisely 101 .'s, without an X at the end.

If I adjust the batch size (.find().batchSize(10).each(...)), it goes through exactly that number of documents before bailing.
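For context on why the count tracks the batch size: a cursor does not pull the whole result set at once. The server hands back documents in batches (101 in the first batch by default, which matches the count above), and the driver requests the next batch once its local buffer runs dry. Below is a toy model of that behaviour, not the driver's actual implementation. `FakeCursor`, `getMoreWorks`, and `countDocs` are hypothetical names invented for the illustration.

```javascript
// Toy model only: FakeCursor is hypothetical and is NOT the MongoDB driver's cursor.
class FakeCursor {
  constructor(docs, batchSize, getMoreWorks) {
    this.docs = docs;                 // stands in for the server-side collection
    this.batchSize = batchSize;       // documents handed over per round trip
    this.getMoreWorks = getMoreWorks; // false simulates follow-up fetches that never succeed
    this.sent = 0;                    // how many documents the "server" has handed out
    this.buffer = this.fetchBatch();  // the first batch arrives with the initial query
  }

  // Take the next slice of up to batchSize documents from the "server".
  fetchBatch() {
    const batch = this.docs.slice(this.sent, this.sent + this.batchSize);
    this.sent += batch.length;
    return batch;
  }

  // Return the next document, or null when nothing more can be obtained.
  next() {
    if (this.buffer.length === 0 && this.getMoreWorks) {
      this.buffer = this.fetchBatch();
    }
    return this.buffer.length > 0 ? this.buffer.shift() : null;
  }
}

// Count how many documents a cursor yields before it returns null.
function countDocs(cursor) {
  let n = 0;
  while (cursor.next() !== null) n += 1;
  return n;
}

const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i }));
const healthy = countDocs(new FakeCursor(docs, 101, true));  // follow-up batches arrive
const stalled = countDocs(new FakeCursor(docs, 101, false)); // only the first batch arrives
console.log(healthy, stalled);
```

In this model, a consumer that never receives a second batch sees exactly `batchSize` documents, which is the pattern described here. One difference from the real symptom: the toy reports end-of-data with `null`, whereas in my script the final `null` callback (the "X") never fires at all.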

So, why is it only processing the first batch? Am I somehow misreading the documentation for .each()? Does it have to do with the fact that this is a command-line script, and somehow the whole script is exiting before the second batch of results comes back, or something? If so, how do I make sure it actually processes all the results?

As a side note, I've tried using .stream() and .forEach(), and in both of those cases as well, it ditches after the first batch.

UPDATE: Well, this is interesting. Just tried connecting to my production server instead of my mongo instance on localhost, and voila, it runs through the entire collection like it should. The server is running mongodb 3.0.6, my local instance is 3.2.3. My version of the node mongodb driver is 2.0.43.

DanM asked Apr 04 '16 16:04


1 Answer

I have 200 documents in my collection and the following code works fine. In other words, I couldn't reproduce the problem. As you can see, I have reduced the batch size to 10.

var MongoClient = require('mongodb').MongoClient;

var url = 'mongodb://localhost:27017/test';
MongoClient.connect(url, function(err, db) {
    if (err) {
        console.log(err);
    }
    else {
        var counter = 0;
        db.collection('collection').find({}).batchSize(10).each(function(e, r){
            // Check the cursor's own error (e), not the outer connection error (err).
            if(e){
                console.log("E: " +  e);
                db.close();
            }
            else{
                if(r ==  null){
                    // A null document means the cursor is exhausted.
                    db.close();
                }
                else{
                    counter += 1;
                    console.log("X: " +  counter);
                }
            }
        });
    }
});

If you are still facing the same issue, I'd suggest updating the MongoDB driver to the latest version. Since drivers are actively developed, bugs sometimes sneak into a released version and cause strange behavior.
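If it helps with debugging, one alternative to .each() is to pull documents one at a time and wait for each step, so the script cannot finish (or close the connection) until the cursor reports it is exhausted. The sketch below assumes a cursor whose next() returns a promise resolving to the next document, or null at the end; to my knowledge the Node driver's cursor.next() behaves this way when called without a callback, but check that against your driver version. `drain` and `makeFakeCursor` are hypothetical helpers, and the fake cursor is only there so the example runs without a server.

```javascript
// Sketch: drain a cursor one document at a time, waiting for each step.
// `cursor` is any object whose next() returns a promise for the next
// document, or for null once the cursor is exhausted (an assumption here).
async function drain(cursor, handle) {
  let count = 0;
  for (let doc = await cursor.next(); doc !== null; doc = await cursor.next()) {
    await handle(doc); // e.g. compute new values and update the document
    count += 1;
  }
  return count; // number of documents processed
}

// Hypothetical in-memory stand-in for a real cursor.
function makeFakeCursor(total) {
  let i = 0;
  return {
    next: () => Promise.resolve(i < total ? { _id: i++ } : null),
  };
}

const seen = [];
const finished = drain(makeFakeCursor(250), async (doc) => {
  seen.push(doc._id);
});
finished.then((count) => console.log('processed ' + count + ' documents'));
```

With a real connection you would pass the result of find() instead of the fake cursor, and call db.close() only after the promise returned by drain() has resolved.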

Saleem answered Oct 06 '22 19:10