
Using mysql pool on amazon lambda

I am trying to use a MySQL pool in my Node.js service that runs on Amazon Lambda. This is the beginning of my module that works with the database:

console.log('init database module ...');
var settings = require('./settings.json');
var mysql = require('mysql');
var pool = mysql.createPool(settings);

According to the logs in the Amazon console, this piece of code is executed very often:

  1. If I just deployed the service and executed 10 requests simultaneously - all these 10 requests execute this piece of code.
  2. If I again execute 10 requests simultaneously immediately after first series - they don't execute this code.
  3. If some time is passed from last query - then some of the requests re-execute that code.

Even if I use global, this reduces but does not eliminate the duplicates:

if (!global.pool) {
    console.log('init database module ...');
    var settings = require('./settings.json');
    var mysql = require('mysql');
    global.pool = mysql.createPool(settings);
}

Moreover, if a request fails with an error, this piece of code is executed again after the request, and global.pool is null at that moment.

So, does this mean that using a pool in Amazon Lambda is not possible? Is there any way to make Amazon use the same pool instance every time?

Boris asked Dec 18 '22 04:12


1 Answer

Each time a Lambda function is invoked, it runs in its own, independent container. If no idle containers are available, a new one is automatically created by the service. Hence:

  1. If I just deployed the service and executed 10 requests simultaneously - all these 10 requests execute this piece of code.

If a container is available, it may be, and very likely will be, reused. When that happens, the process is already running, so the global section doesn't run again -- the invocation starts with the handler. Therefore:

  2. If I again execute 10 requests simultaneously immediately after first series - they don't execute this code.

After each invocation is complete, the container that was used is frozen, and will ultimately be either thawed and reused for a subsequent invocation, or if it isn't needed after a few minutes, it is destroyed. Thus:

  3. If some time is passed from last query - then some of the requests re-execute that code.

Makes sense, now, right?

The only "catch" is that the amount of time that must elapse before a container is destroyed is not a fixed value. Anecdotally, it appears to be about 15 minutes, but I don't believe it's documented, since most likely the timer is adaptive... the service can (at its discretion) use heuristics to predict whether recent activity was a spike or likely to be sustained, and probably considers other factors.

(Lambda@Edge, which is Lambda integrated with CloudFront for HTTP header manipulation, seems to operate with different timing. Idle containers seem to persist much longer, at least in small quantities, but this makes sense because they are always very small containers... and again this observation is anecdotal.)

The global section of your code only runs when a new container is created.
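A minimal sketch of that behaviour, using plain counters instead of a database so it runs anywhere. The names containerInits and invocations and the shape of the returned object are illustrative assumptions, not part of the Lambda API:

```javascript
// Module scope: executed once, when a container first loads the module (cold start).
let containerInits = 0;
let invocations = 0;

containerInits += 1;
console.log('init database module ...');

// Handler: executed on every invocation, whether the container is new or reused.
async function handler(event) {
  invocations += 1;
  // While a container stays warm, containerInits stays at 1 and invocations grows.
  return { containerInits, invocations };
}

// In a deployed function the handler would be exported,
// for example with `exports.handler = handler;`.
```

Calling handler twice in the same process models two invocations landing on one warm container: the log line appears once, while the invocation counter keeps increasing.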

Pooling doesn't make sense here because nothing is shared during an invocation: each container (one process) runs only one invocation at any one time, so there is never more than one consumer for the connections.
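Since a container serves one invocation at a time, a single lazily created connection per container is usually enough. A minimal sketch, assuming the callback-style query and destroy methods of the mysql package used in the question. makeHandler and the injected createConnection factory are hypothetical names, introduced so the reuse logic can be exercised without a database, and the SELECT 1 query is only a placeholder:

```javascript
// Hypothetical wrapper. In a real function, createConnection could be
//   () => require('mysql').createConnection(require('./settings.json'))
function makeHandler(createConnection) {
  // Closure state lives as long as the container does: it survives warm
  // invocations and is lost when the container is destroyed.
  let connection = null;

  return async function handler(event) {
    if (connection === null) {
      connection = createConnection(); // first invocation in this container
    }
    try {
      // Wrap the callback-style query in a promise for the async handler.
      return await new Promise((resolve, reject) => {
        connection.query('SELECT 1 AS ok', (err, rows) =>
          err ? reject(err) : resolve(rows));
      });
    } catch (err) {
      // Drop a possibly broken connection so the next invocation reconnects.
      connection.destroy();
      connection = null;
      throw err;
    }
  };
}
```

The handler is async on purpose: it hands back its result when the promise settles, so the connection can stay open between invocations.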

What you will want to do, though, is change the wait_timeout on the connections. MySQL Server doesn't have an effective way to "discover" that an idle connection has gone away entirely, so when your connection goes away when the container is destroyed, the server just sits there, and the connection remains in the Sleep state until the default wait_timeout expires. The default is 28800 seconds, or 8 hours, which is too long. You can change this on the server, or send the query SET @@wait_timeout = 900 (though you'll need to experiment with an appropriate value).
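The MySQL system variable that controls how long the server keeps an idle connection is wait_timeout, and it can be set for a single session. A small sketch of the per-connection variant, assuming a connection object with the query(sql, values, callback) method of the mysql package. The helper name setWaitTimeout and the 900-second value are illustrative only:

```javascript
// Hypothetical helper: shorten the server-side idle timeout for one connection.
function setWaitTimeout(connection, seconds, callback) {
  // SESSION scope affects only this connection; the server default is untouched.
  // The ? placeholder is filled in (and escaped) by the mysql package.
  connection.query('SET SESSION wait_timeout = ?', [seconds], callback);
}
```

It would typically be called once, right after the connection is created, for example setWaitTimeout(connection, 900, done).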

Or, you can establish and destroy the connection inside the handler for each invocation. This will take a little bit more time, of course, but it's a sensible approach if your function isn't going to be running very often. The MySQL client/server protocol's connection/handshake sequence is reasonably lightweight, and frequent connect/disconnect doesn't impose as much load on the server as you might expect... although you would not want to do that on an RDS server that uses IAM token authentication, which is more resource-intensive.
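The connect-per-invocation approach can be sketched as follows, again assuming the callback-style query and end methods of the mysql package. makePerInvocationHandler and the injected createConnection factory are hypothetical names, and SELECT 1 stands in for the real work:

```javascript
// Hypothetical wrapper. In a real function, createConnection could be
//   () => require('mysql').createConnection(require('./settings.json'))
function makePerInvocationHandler(createConnection) {
  return async function handler(event) {
    const connection = createConnection(); // new connection for every invocation
    try {
      return await new Promise((resolve, reject) => {
        connection.query('SELECT 1 AS ok', (err, rows) =>
          err ? reject(err) : resolve(rows));
      });
    } finally {
      // Close the connection cleanly before the container is frozen, so the
      // server is not left with a connection in the Sleep state.
      await new Promise((resolve) => connection.end(() => resolve()));
    }
  };
}
```

Because the connection is closed in the finally block, it is released whether the query succeeds or fails.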

Michael - sqlbot answered Jan 06 '23 03:01