
convert mongoose stream to array

I have worked with MongoDB but am quite new to the mongoose ORM. I was trying to fetch data from a collection, and the explain() output showed 50ms, yet the overall time it took to fetch the data via mongoose was 9 seconds. Here is the query:

Node.find({'dataset': datasetRef}, function (err, nodes){
   // handle error and data here
});

Then I applied an index on the field I was querying on. The explain() output now showed 4ms, but the total time to retrieve data via mongoose did not change. I then searched a bit and found that using lean() can bring the read performance of mongoose queries quite close to that of native MongoDB.

So I changed my query to:

Node.find({'dataset': datasetRef})
.lean()
.stream({transform: JSON.stringify})
.pipe(res)

This solved the performance issues completely. But the end result is a stream of JSON docs like this:

{var11: val11, var12: val12}{var21: val21, var22: val22} ...

How do I parse this to form an array of docs? Or should I not be using a stream at all? In my opinion, there is no point using a stream if I am planning to form the array on the backend, since I will then have to wait for all the docs to be read into memory. But I also think that parsing and creating the whole array on the front end might be costly.

How can I achieve the best performance in this case without clogging the network?
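For reference, if the concatenated output does reach the client as-is, it can be repaired there before parsing. A minimal sketch (my own illustration, not from the answers below), which assumes no document contains the literal substring `}{` inside a string value:

```javascript
// Turn '{...}{...}{...}' into a proper JSON array and parse it.
// Caveat: this breaks if any string value inside a doc contains "}{".
function concatenatedJsonToArray(body) {
    if (body.trim() === '') return [];
    return JSON.parse('[' + body.replace(/\}\{/g, '},{') + ']');
}

console.log(concatenatedJsonToArray('{"a":1}{"b":2}')); // [ { a: 1 }, { b: 2 } ]
```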

UPDATE

I am trying to solve this problem using a through stream. However, I am not yet able to insert commas between the JSON objects. See the code below:

res.write("[");

var through = require('through');
var tr = through(
  function write(data){
    this.queue(data.replace(/\}\{/g,"},{"));
  }
);

var dbStream = db.node.find({'dataset': dataSetRef})
.lean()
.stream({'transform': JSON.stringify});

dbStream.on("end", function(){
    res.write("]");
});

dbStream
.pipe(tr)
.pipe(res);

With this, I am able to get the "[" at the beginning and "]" at the end. However, I am still not able to get the pattern "}{" replaced with "},{". I am not sure what I am doing wrong.

UPDATE 2

Now I have figured out why the replace is not working. Since I specified the transform function as JSON.stringify, the stream emits one JSON object at a time, so the pattern }{ never occurs within a single chunk.

Now I have modified my code and written a custom transform function which does JSON.stringify and then appends a comma at the end. The only problem is that I don't know when the last JSON object in the stream arrives, because I don't want to append the comma in that case. At the moment, I append an empty JSON object once the end is encountered, but that does not feel like a convincing solution. Here is the code:

res.write("[");
function transform(data){
    return JSON.stringify(data) + ",";
}

var dbStream = db.node.find({'dataset': dataSetRef})
.lean()
.stream({'transform': transform});

dbStream.on("end", function(){
    res.write("{}]");
});

dbStream
.pipe(res);
Asked Jan 30 '15 by Mandeep Singh

3 Answers

The only problem I am facing here is that I don't know when it is the last JSON object in the stream.

But you do know which one is first. Knowing that, instead of appending the comma, you can prepend it to every object except the first one. In order to do that, set up your transform function inside a closure:

function transformFn(){

    var first = true;

    return function(data) {

        if (first) {

            first = false;
            return JSON.stringify(data);
        }
        return "," + JSON.stringify(data);
    }
}

Now you can just call that function and set it as your actual transform.

var transform = transformFn();
res.write("[");
var dbStream = db.node.find({'dataset': dataSetRef})
.lean()
.stream({'transform': transform});

dbStream.on("end", function(){
    res.write("]");
});

dbStream
.pipe(res);
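The closure pattern above can be sanity-checked without mongoose at all. A hypothetical standalone run of the same transform:

```javascript
// Same closure idea as above: the first call emits no comma,
// every later call prepends one.
function transformFn() {
    var first = true;
    return function (data) {
        var prefix = first ? '' : ',';
        first = false;
        return prefix + JSON.stringify(data);
    };
}

var transform = transformFn();
var out = '[' + transform({ a: 1 }) + transform({ b: 2 }) + ']';
console.log(out); // [{"a":1},{"b":2}]
```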
Answered Nov 12 '22 by chrisbajorin


@cbajorin and @rckd both gave correct answers.

However, repeating this code all the time seems like a pain.

Hence my solution uses an extra Transform stream to achieve the same thing.

import { Transform } from 'stream'

class ArrayTransform extends Transform {
    constructor(options) {
        super(options)
        this._index = 0
    }

    _transform(data, encoding, done) {
        if (!(this._index++)) {
            // first element, add opening bracket
            this.push('[')
        } else {
            // following element, prepend comma
            this.push(',')
        }
        this.push(data)
        done()
    }

    _flush(done) {
        if (!(this._index++)) {
            // empty
            this.push('[]')
        } else {
            // append closing bracket
            this.push(']')
        }
        done()
    }
}

Which in turn can be used as:

const toArray = new ArrayTransform();
Model.find(query).lean().stream({transform: JSON.stringify })
    .pipe(toArray)
    .pipe(res)

EDIT: added check for empty

Answered Nov 12 '22 by pixeleet


I love @cdbajorin's solution, so I created a more readable version of it (ES6):

Products
    .find({})
    .lean()
    .stream({
        transform: () => {
            let index = 0;
            return (data) => {
                return (!(index++) ? '[' : ',') + JSON.stringify(data);
            };
        }() // invoke
    })
    .on('end', () => {
        res.write(']');
    })
    .pipe(res);
Answered Nov 12 '22 by rckd