Memory issues on knex.js when using streams

I'm trying to export a whole SQLite3 database table to CSV using knex.js. Since the table can hold up to 300,000 rows, I use streams to avoid memory issues. But if I look at the memory usage of my app, it climbs to 800 MB, or I get an "out of memory" error.

How can I handle a large query result with knex.js on an SQLite3 database?

Below is a sample of the code:

knex.select().from(table).stream(function (stream) {
    var stringifier = stringify(opts); // csv-stringify transform stream
    var fileStream = fs.createWriteStream(file);

    var i = 0;
    stringifier.on('readable', function () {
        var row;
        while ((row = stringifier.read())) {
            fileStream.write(row);
            console.log("row " + i++); // debug
        }
    });

    fileStream.once('open', function (fd) {
        stream.pipe(stringifier);
    });
});

EDIT

It seems knex.js streams for SQLite3 databases are "fake" streams. Below is the source code of the stream function for SQLite3 in knex:

Runner_SQLite3.prototype._stream = Promise.method(function(sql, stream, options) {
    /*jshint unused: false*/
    var runner = this;
    return new Promise(function(resolver, rejecter) {
        stream.on('error', rejecter);
        stream.on('end', resolver);
        return runner.query(sql).map(function(row) {
            stream.write(row);
        }).catch(function(err) {
            stream.emit('error', err);
        }).then(function() {
            stream.end();
        });
    });
});

We can see that it waits for the query to finish executing before creating the stream from the result array, so every row is already in memory.
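In other words, the adapter runs the whole query first and only then replays the rows into the stream. A stripped-down model of that behaviour, with a stub standing in for runner.query(sql):

```javascript
// Stub standing in for runner.query(sql): resolves the FULL result set at once.
function query() {
    return Promise.resolve([{ id: 1 }, { id: 2 }, { id: 3 }]);
}

var written = [];
query().then(function (rows) {
    // By the time the first "write" happens, every row is already in memory;
    // the "stream" only iterates over the completed array.
    rows.forEach(function (row) {
        written.push(row);
    });
    console.log('rows written after full materialisation: ' + written.length);
});
```

With 300,000 rows the array itself is the 800 MB, regardless of how the rows are written out afterwards.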

VERSION:

  • Knex.js 0.7.5
  • node 0.12

Thanks for your help.

asked Mar 24 '15 by Durden
1 Answer

I don't think there is a real solution. Instead, I use limit and offset to fetch the data step by step with knex.js, writing each row to a write stream. An implementation example for those who want one:

exportTable: function(table, writeStream) {
    var totalRows;
    var rowLimit = _config.ROW_LIMIT;

    return DatabaseManager.countAll(table).then(function(count) {

        totalRows = count[0]['count(*)'];
        // One iteration per page of rowLimit rows.
        var iterations = new Array(Math.ceil(totalRows / rowLimit));

        // Promise.reduce (Bluebird) runs the pages sequentially,
        // so only one page of rows is in memory at a time.
        return Promise.reduce(iterations, function(total, item, index) {

            return _knex.select().from(table)
                .limit(rowLimit)
                .offset(index * rowLimit)
                .map(function(row) {
                    writeStream.write(row);
                });

        }, 0);

    }).catch(function(err) {
        console.log(err);
        return Promise.reject(err);
    });
}
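For reference, the same sequential limit/offset loop can be written without Bluebird's Promise.reduce. The sketch below is self-contained: queryPage is a stub standing in for the _knex.select().from(table).limit(...).offset(...) query, and TABLE is a plain array playing the database table:

```javascript
// Stub table of 7 rows, and a queryPage() stub standing in for
// a knex limit/offset query against it.
var TABLE = [];
for (var i = 0; i < 7; i++) TABLE.push({ id: i });

function queryPage(offset, limit) {
    return Promise.resolve(TABLE.slice(offset, offset + limit));
}

// Fetch pages one after another; only one page is in memory at a time.
function exportAll(onRow, rowLimit) {
    function nextPage(offset) {
        return queryPage(offset, rowLimit).then(function (rows) {
            rows.forEach(onRow);
            if (rows.length < rowLimit) return; // short page => we're done
            return nextPage(offset + rowLimit);
        });
    }
    return nextPage(0);
}

var seen = [];
exportAll(function (row) { seen.push(row.id); }, 3).then(function () {
    console.log('exported ' + seen.length + ' rows'); // exported 7 rows
});
```

Each recursion only starts after the previous page has been handed to onRow, which is the same peak-memory guarantee the Promise.reduce version gives.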
answered Nov 10 '22 by Durden