Batch update in knex

I'd like to perform a batch update using Knex.js

For example:

'UPDATE foo SET [theValues] WHERE idFoo = 1'
'UPDATE foo SET [theValues] WHERE idFoo = 2'

with values:

{ name: "FooName1", checked: true } // to `idFoo = 1`
{ name: "FooName2", checked: false } // to `idFoo = 2`

I was using node-mysql previously, which allows multiple statements. With that, I simply built a multiple-statement query string and sent it over the wire in a single round trip.
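
For reference, the node-mysql version looked roughly like this (connection config and the exact values are illustrative, not my real setup):

// node-mysql with multipleStatements enabled — one round trip for all updates
const mysql = require('mysql');

const connection = mysql.createConnection({
  host: 'localhost',
  user: 'user',
  password: 'secret',
  database: 'mydb',
  multipleStatements: true // required to send several statements in one query() call
});

const rows = [
  { idFoo: 1, name: 'FooName1', checked: true },
  { idFoo: 2, name: 'FooName2', checked: false }
];

// Build one multi-statement string with escaped values and send it in one go
const sql = rows
  .map(row => mysql.format('UPDATE foo SET ? WHERE idFoo = ?;', [
    { name: row.name, checked: row.checked },
    row.idFoo
  ]))
  .join(' ');

connection.query(sql, err => { /* handle errors / results here */ });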

I'm not sure how to achieve the same with Knex. I can see batchInsert as an API method I can use, but nothing as far as batchUpdate is concerned.

Note:

  • I can do an async iteration and update each row separately. That's bad because it means lots of round trips between the server and the DB (a rough sketch of that approach follows this list)

  • I can use Knex's raw() and probably do something similar to what I did with node-mysql. However, that defeats the whole purpose of Knex as a DB abstraction layer (it introduces strong DB coupling)
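
To illustrate the first option, the per-row iteration I'd rather avoid looks roughly like this (assuming an already-configured knex instance; table and column names match the example above):

// One UPDATE — and one round trip — per row
const rows = [
  { idFoo: 1, name: 'FooName1', checked: true },
  { idFoo: 2, name: 'FooName2', checked: false }
];

Promise.all(rows.map(row =>
  knex('foo')
    .where('idFoo', row.idFoo)
    .update({ name: row.name, checked: row.checked })
));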

So I'd like to do this using something "knex-y".

Any ideas welcome.

asked Nov 11 '16 by nicholaswmin

3 Answers

I needed to perform a batch update inside a transaction (I didn't want partial updates in case something went wrong). I resolved it the following way:

// I wrap knex as 'connection'
return connection.transaction(trx => {
    const queries = [];
    users.forEach(user => {
        const query = connection('users')
            .where('id', user.id)
            .update({
                lastActivity: user.lastActivity,
                points: user.points,
            })
            .transacting(trx); // This makes every update be in the same transaction
        queries.push(query);
    });

    Promise.all(queries)      // Promise.all runs all the queries
        .then(trx.commit)     // commit the transaction if they all succeed
        .catch(trx.rollback); // roll back if any of them fails
});
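
To make this reusable, the snippet could be wrapped in a function and called like this (the function name and the users data are hypothetical):

// Hypothetical wrapper around the snippet above
function batchUpdateUsers(users) {
  return connection.transaction(trx => {
    const queries = users.map(user =>
      connection('users')
        .where('id', user.id)
        .update({ lastActivity: user.lastActivity, points: user.points })
        .transacting(trx)
    );
    Promise.all(queries).then(trx.commit).catch(trx.rollback);
  });
}

// The transaction promise resolves on commit and rejects on rollback
batchUpdateUsers([
  { id: 1, lastActivity: new Date(), points: 10 },
  { id: 2, lastActivity: new Date(), points: 25 }
]).then(() => console.log('batch committed'))
  .catch(err => console.error('batch rolled back', err));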
answered Nov 17 '22 by A Bravo Dev

You have a good idea of the pros and cons of each approach. I would recommend a single raw query that bulk updates, rather than several async updates. Yes, you can run them in parallel, but your bottleneck becomes the time it takes the DB to run each update. Details can be found here.

Below is an example of a batch upsert using knex.raw. Assume that records is an array of objects (one object for each row we want to update) whose property names line up with the columns in the database table you want to update:

// knex must be initialized with your connection config before it can run queries
var knex = require('knex')({ client: 'mysql', connection: { /* your connection settings */ } }),
    _ = require('underscore');

function bulkUpdate (records) {
    var updateQuery = [
        'INSERT INTO mytable (primaryKeyCol, col2, colN) VALUES',
        _.map(records, () => '(?)').join(','),
        'ON DUPLICATE KEY UPDATE',
        'col2 = VALUES(col2),',
        'colN = VALUES(colN)'
    ].join(' '),

    vals = [];

    _(records).map(record => {
        vals.push(_(record).values()); // one array of values per record, matching each `(?)`
    });

    return knex.raw(updateQuery, vals);
}

This answer does a great job explaining the runtime relationship between the two approaches.

Edit:

It was requested that I show what records would look like in this example.

var records = [
  { primaryKeyCol: 123, col2: 'foo', colN: 'bar' },
  { /* some other record, same props */ }
];

Please note that if your record has more properties than the ones you specified in the query, you cannot do:

  _(records).map(record => {
      vals.push(_(record).values());
  });

Because you will hand the query too many values per record, and knex will fail to match each record's values with the ? placeholders in the query. Instead, push one array per record containing only the values you want to insert, like so:

  // assume a record has additional property `type` that you dont want to
  // insert into the database
  // example: { primaryKeyCol: 123, col2: 'foo', colN: 'bar', type: 'baz' }
  _(records).map(record => {
      // push one array per record so it lines up with each `(?)` placeholder
      vals.push([record.primaryKeyCol, record.col2, record.colN]);
  });

There are less repetitive ways of doing the above explicit references; one option is sketched below. Hope this helps!
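
For instance, you could list the column names once and pull them off each record in order (the columns array is illustrative and should match the columns named in the query):

  var columns = ['primaryKeyCol', 'col2', 'colN'];

  _(records).map(record => {
      // still one array per record, matching each `(?)` placeholder
      vals.push(columns.map(col => record[col]));
  });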

answered Nov 17 '22 by Patrick Motard

Assuming you have a collection of valid keys/values for the given table:

// abstract transactional batch update
function batchUpdate(table, collection) {
  return knex.transaction(trx => {
    const queries = collection.map(tuple =>
      knex(table)
        .where('id', tuple.id)
        .update(tuple)
        .transacting(trx)
    );
    return Promise.all(queries)
      .then(trx.commit)    
      .catch(trx.rollback);
  });
}

To call it

batchUpdate('user', [...]);

Are you unfortunately subject to non-conventional column names? No worries, I got you fam:

function batchUpdate(options, collection) {
  return knex.transaction(trx => {
    const queries = collection.map(tuple =>
      knex(options.table)
        .where(options.column, tuple[options.column])
        .update(tuple)
        .transacting(trx)
    );
    return Promise.all(queries)
      .then(trx.commit)    
      .catch(trx.rollback);
  });
}

To call it

batchUpdate({ table: 'user', column: 'user_id' }, [...]);

Modern Syntax Version:

const batchUpdate = async (options, collection) => {
  const { table, column } = options;
  const trx = await knex.transaction();
  try {
    await Promise.all(collection.map(tuple =>
      knex(table)
        .where(column, tuple[column])
        .update(tuple)
        .transacting(trx)
    ));
    await trx.commit();
  } catch (error) {
    await trx.rollback();
    throw error; // re-throw so the caller knows the batch failed
  }
};
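
A quick usage sketch (the table/column names and data are illustrative; since batchUpdate is async, the call should be awaited or chained):

// Illustrative call site — assumes the same configured `knex` instance
(async () => {
  try {
    await batchUpdate(
      { table: 'user', column: 'user_id' },
      [
        { user_id: 1, points: 10 },  // column names here are just examples
        { user_id: 2, points: 25 }
      ]
    );
    console.log('batch committed');
  } catch (err) {
    console.error('batch rolled back', err);
  }
})();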
answered Nov 17 '22 by jiminikiz