AWS Lambda function that executes 5000+ promises to AWS SQS is extremely unreliable

I'm writing a Node AWS Lambda function that queries around 5,000 items from my DB and sends them via messages into an AWS SQS queue.

My local environment involves running my Lambda with AWS SAM Local and emulating AWS SQS with GoAWS.
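
For reproducibility, this is roughly how the SQS client gets pointed at GoAWS instead of the real SQS endpoint when running locally (a minimal sketch; the port and queue name are assumptions about my config, not exact values):

const AWS = require('aws-sdk');

// Point the SDK at the local GoAWS instance rather than the real SQS endpoint.
// GoAWS listens on port 4100 by default; adjust to your own setup.
const client = new AWS.SQS({
  region: 'us-east-1',                 // GoAWS doesn't care which region is set
  endpoint: 'http://localhost:4100',
});

// URL of a queue created in GoAWS (queue name is just an example)
const queueUrl = 'http://localhost:4100/queue/accounts-queue';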

An example skeleton of my Lambda is:

async run() {
  try {
    const accounts = await this.getAccountsFromDB();
    const results = await this.writeAccountsIntoQueue(accounts);
    return 'I\'ve written: ' + results + ' messages into SQS';
  } catch (e) {
    console.log('Caught error running job: ');
    console.log(e);
    return e;
  }
}

There are no performance issues with my getAccountsFromDB() function and it runs almost instantly, returning me an array of 5,000 accounts.

My writeAccountsIntoQueue function looks like:

async writeAccountsIntoQueue(accounts) {
  // Extract the sqsClient and queueUrl from the class 
  const { sqsClient, queueUrl } = this;
  try {
    // Create array of functions to concurrently call later
    let promises = accounts.map(acc => async () => await sqsClient.sendMessage({
        QueueUrl: queueUrl,
        MessageBody: JSON.stringify(acc),
        DelaySeconds: 10,
      })
    );

    // Invoke the functions concurrently, using helper function `eachLimit`
    let writtenMessages = await eachLimit(promises, 3);
    return writtenMessages;
  } catch (e) {
    console.log('Error writing accounts into queue');
    console.log(e);
    return e;
  }
}

My helper, eachLimit looks like:

async function eachLimit (funcs, limit) {
  let rest = funcs.slice(limit);
  await Promise.all(
    funcs.slice(0, limit).map(
      async (func) => {
        await func();
        while (rest.length) {
          await rest.shift()();
        }
      }
    )
  );
}

To the best of my understanding, it should be limiting concurrent executions to limit.
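
As a quick sanity check (this snippet is just an illustration, not part of the Lambda), feeding eachLimit dummy tasks that track how many are running at once shows the cap in action:

// Standalone check: each dummy task records how many tasks are in flight at the same time.
let inFlight = 0;
let maxInFlight = 0;

const tasks = Array.from({ length: 20 }, (_, i) => async () => {
  inFlight += 1;
  maxInFlight = Math.max(maxInFlight, inFlight);
  await new Promise(resolve => setTimeout(resolve, 10)); // simulate a short async call
  inFlight -= 1;
  return i;
});

eachLimit(tasks, 3).then(() => console.log('Max in flight:', maxInFlight)); // stays at 3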

Additionally, I've wrapped the AWS SDK SQS client to return an object with a sendMessage function that looks like:

sendMessage(params) {
  const { client } = this;
  return new Promise((resolve, reject) => {
    client.sendMessage(params, (err, data) => {
      if (err) {
        console.log('Error sending message');
        console.log(err);
        return reject(err);
      }
      return resolve(data);
    });
  });
}

So nothing fancy there, just Promisifying a callback.
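
(For what it's worth, if the wrapped client is the AWS SDK for JavaScript v2, the SDK's built-in promise support can do the same thing without a hand-rolled wrapper; a minimal sketch, assuming this.client is an AWS.SQS instance:)

sendMessage(params) {
  // .promise() on the request object returns a native Promise (AWS SDK v2)
  return this.client.sendMessage(params).promise();
}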

I've got my Lambda set up to time out after 300 seconds, and it almost always times out; when it doesn't, it ends abruptly and misses some final logging that should happen, which makes me think it may even be erroring somewhere, silently. When I check the SQS queue I'm missing around 1,000 entries.

1 Answer

I can see a couple of issues in your code.

First:

    let promises = accounts.map(acc => async () => await sqsClient.sendMessage({
        QueueUrl: queueUrl,
        MessageBody: JSON.stringify(acc),
        DelaySeconds: 10,
      })
    );

You're abusing async / await here. Bear in mind that await waits until a promise is resolved before continuing, so wrapping each call in an extra async () => await ... layer buys you nothing: when one of those functions is eventually invoked, it just awaits the promise that sendMessage already returns and wraps it again. Since you're only interested in functions that hand back the promises, you can simply do this instead:

const promises = accounts.map(acc => () => sqsClient.sendMessage({
       QueueUrl: queueUrl,
       MessageBody: JSON.stringify(acc),
       DelaySeconds: 10,
    })
);

Now, for the second part: your eachLimit implementation looks wrong and very verbose. I've refactored it with the help of es6-promise-pool, which handles the concurrency limit for you:

const PromisePool = require('es6-promise-pool')

function eachLimit(promiseFuncs, limit) {
    // The producer hands the pool one new promise at a time and
    // returns null once there is nothing left to start.
    const promiseProducer = function () {
        if (promiseFuncs.length) {
            const promiseFunc = promiseFuncs.shift();
            return promiseFunc();
        }

        return null;
    }

    const pool = new PromisePool(promiseProducer, limit)
    const poolPromise = pool.start();
    return poolPromise;
}

Lastly, and very importantly, have a look at the SQS limits: SQS FIFO supports up to 300 sends per second. Since you are processing 5k items, you could probably up your concurrency limit to about 5k / (300 + 50), i.e. roughly 15. The 50 can be any positive number, just to stay a bit under the limit. Also, consider using SendMessageBatch, which gives you much more throughput and can reach 3k sends per second.

EDIT

As I suggested above, throughput is much better with sendMessageBatch, so I've refactored the code that maps your promises to use sendMessageBatch:

function chunkArray(myArray, chunk_size){
    var index = 0;
    var arrayLength = myArray.length;
    var tempArray = [];
    var myChunk;

    for (index = 0; index < arrayLength; index += chunk_size) {
        myChunk = myArray.slice(index, index + chunk_size);
        tempArray.push(myChunk);
    }

    return tempArray;
}

const groupedAccounts = chunkArray(accounts, 10);

const promiseFuncs = groupedAccounts.map(accountsGroup => {
    const messages = accountsGroup.map((acc,i) => {
        return {
            Id: `pos_${i}`,
            MessageBody: JSON.stringify(acc),
            DelaySeconds: 10
        }
    });

    return () => sqsClient.sendMessageBatch({
        Entries: messages,
        QueueUrl: queueUrl
     })
});

Then you can call eachLimit as usual:

const result = await eachLimit(promiseFuncs, 3);

The difference now is that each promise sends a batch of n messages (10 in the example above).
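
One extra thing worth checking, which isn't shown above: SendMessageBatch can return a successful response even when individual entries fail, so it's worth inspecting the Failed array of each batch result when messages go missing. A sketch of how each group's function could be extended, assuming the wrapped client's sendMessageBatch returns a promise like the sendMessage wrapper does:

// Log any entries SQS rejected within an otherwise successful batch call
return () => sqsClient.sendMessageBatch({
    Entries: messages,
    QueueUrl: queueUrl
  }).then(result => {
    if (result.Failed && result.Failed.length) {
      console.log('Failed batch entries:', JSON.stringify(result.Failed));
    }
    return result;
  });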
