 

Node.js, Express, MongoDB and streams

I am looking for the best way to stream data from MongoDB to my web client via my Node.js server layer. Each query requests about 10MB of data, and the query is already indexed on day_timestamp. Note, I have already read this post.

The only Mongo related module I am using is as follows (do I need others to achieve my goals?):

MongoClient = require('mongodb').MongoClient;
JSONStream = require('JSONStream'); // also needed, for JSONStream.stringify() below

Currently my code looks something like this:

MongoClient.connect('mongodb://host:port/myDatabase', function(err, db) {
    if(err) throw err;
    console.log("Connected to Database");

    // Server picks up URL requests made by browser
    app.get("/:type/:category/:resolution/:from/:to/", function (req, res){
        var start = moment();

        var type = String(req.params.type)
            ,category = String(req.params.category)
            ,resolution = String(req.params.resolution)
            ,from = moment.utc(req.params.from).toDate()
            ,to = moment.utc(req.params.to).toDate()
            ,options = {
                parse : true, 
                accept : 'application/json'
            };

        res.set('Content-Type', 'application/json'); // Required?
        res.writeHead(200, { 'Content-Type': 'application/json'}); // Required?
        var collection = db.collection(category);
        var stream = collection.find({'day_timestamp':{'$gte':from, '$lte':to}})
            .sort({day_timestamp:1})
            .stream()
            .pipe(JSONStream.stringify())
            .pipe(res);
    });
});

This works, but it does not appear to offer any performance gain over a 'normal' collection.find() callback that nests res.json(...).

I would like to understand a few things.

Firstly, I want to stream data from MongoDB directly to my Node.js server and, as soon as it arrives at my server, stream it directly to the client. If need be, this can be BSON all the way and I can deserialise it at the client. Is this possible?

Secondly, how would I go about timing this stream's performance versus a normal collection.find() callback? With the latter I can easily do this, but I am not clear how to do so with the stream example (n.b. stream.end doesn't appear to work as I would expect, possibly because of the .pipe).

Thirdly, I am trying to get my data from MongoDB to my client as fast as possible, and my Node.js layer does not have to do much (if any) data processing because the data is stored as required in the database. Am I going about things the right way to achieve this goal?

asked Feb 19 '14 by jtromans

1 Answer

This writes to res every time there's data to write:

var stream = collection.find({'day_timestamp':{'$gte':from, '$lte':to}})
    .sort({day_timestamp:1})
    .stream();

var first = true;
res.write('['); // open a JSON array so the concatenated documents stay valid JSON

stream.on('data', function(data) {
  if (!first) res.write(',');
  first = false;
  res.write(JSON.stringify(data));
});

stream.on('end', function() {
  res.write(']');
  res.end();
});
answered Sep 21 '22 by algoni