
Is it possible to intercept kill signals to close DB connections right before a lambda function is killed and started cold?

To speed up Lambda execution, I am trying to move some parts of my Python code outside the handler function.

As per Lambda's documentation:

After a Lambda function is executed, AWS Lambda maintains the Execution Context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the Execution Context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again. This Execution Context reuse approach has the following implications:

Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations…

Following their example, I have moved my database connection logic outside the handler function so subsequent WARM runs of the function can re-use the connection instead of creating a new one each time the function executes.
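A minimal sketch of that pattern, assuming the psycopg2 driver and a hypothetical PG_DSN environment variable (neither is named in the question). The connect parameter is there only so the connection factory can be swapped out:

```python
import os

# Module scope: this value survives between WARM invocations of the
# same execution context and is lost on a COLD start.
_conn = None


def _connect():
    # Assumption: psycopg2 is packaged with the function and the DSN is
    # supplied through a hypothetical PG_DSN environment variable.
    import psycopg2  # imported lazily so the module loads without the driver
    return psycopg2.connect(os.environ["PG_DSN"])


def get_connection(connect=_connect):
    """Return the cached connection, creating it if missing or closed."""
    global _conn
    # conn.closed is non-zero once the client knows the connection is gone.
    # A connection dropped by the server may still look open until the
    # next query fails.
    if _conn is None or _conn.closed:
        _conn = connect()
    return _conn


def handler(event, context, connect=_connect):
    conn = get_connection(connect)
    with conn.cursor() as cur:
        cur.execute("SELECT 1")
        return cur.fetchone()[0]
```

Lambda calls handler(event, context), so the default factory is used there. Only the first invocation in a given execution context pays for the connection.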

However, AWS Lambda provides no guarantee that every invocation after a COLD start will run warm, so if Lambda decides a COLD start is necessary, my code will re-create the database connection.

When this happens, I assume the previous (WARM) instance of my function that Lambda tore down would still have had an active database connection that was never closed. If the pattern kept repeating, I suspect I'd end up with a lot of orphaned DB connections.

Is there a way in Python to detect if Lambda is trying to kill my function instance (maybe they send a SIGTERM signal?) and have it close active DB connections?

The database I'm using is Postgres.

asked May 06 '19 23:05 by Vinayak


2 Answers

Unfortunately, there is no way to know when a Lambda container will be destroyed.

With that out of the way, cold starts and DB connections are both heavily discussed topics with Lambda. Worse, there is no definitive answer, so each should be handled case by case.

Personally, I think the best way to go about this is to create connections as needed and have Postgres kill the idle ones after a timeout on the server side. For that, see "How to close idle connections in PostgreSQL automatically?"
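One way to do the server-side cleanup is a periodic query against pg_stat_activity that terminates sessions idle for too long. The following is only a sketch: the function name, the 300-second threshold, and the idea of running it from a scheduled job are assumptions, and conn is assumed to be a psycopg2-style connection whose role is allowed to terminate the sessions in question.

```python
TERMINATE_IDLE_SQL = """
    SELECT pg_terminate_backend(pid)
    FROM pg_stat_activity
    WHERE datname = current_database()
      AND pid <> pg_backend_pid()
      AND state = 'idle'
      AND state_change < now() - %s * interval '1 second'
"""


def terminate_idle_connections(conn, max_idle_seconds=300):
    """Ask Postgres to drop sessions idle longer than max_idle_seconds.

    Returns the number of sessions the query matched.
    """
    with conn.cursor() as cur:
        # The threshold is passed as a bound parameter, not formatted in.
        cur.execute(TERMINATE_IDLE_SQL, (max_idle_seconds,))
        matched = cur.rowcount
    conn.commit()  # end the transaction the driver opened implicitly
    return matched
```

The pid <> pg_backend_pid() condition keeps the cleanup session from terminating itself.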

You might also want to fine-tune how many Lambda instances you have running at any point in time. For this I would recommend setting a concurrency limit on your function (see the AWS docs). This way you limit the number of concurrently running instances and are less likely to drown your DB server in connections.

Jeremy Daly (Serverless Hero) has a great blog post on this: "How To: Manage RDS Connections from AWS Lambda Serverless Functions".

He also has a project, in Node unfortunately, that wraps the mysql connection: serverless-mysql. It monitors connections and manages them automatically, for example by killing zombies. You might find something similar for Python.
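As a rough illustration of sizing that limit, the sketch below derives a concurrency value from the database's connection budget and applies it with boto3's put_function_concurrency. The helper names and the default of 5 connections held back for other clients are hypothetical, and the bound is approximate because connections orphaned by torn-down instances also count against the database until they time out.

```python
def max_safe_concurrency(db_max_connections, reserved_connections=5,
                         connections_per_instance=1):
    """Largest concurrency that stays inside the DB connection budget."""
    budget = db_max_connections - reserved_connections
    return max(budget // connections_per_instance, 0)


def apply_concurrency_limit(function_name, limit):
    """Set reserved concurrency on a function (needs AWS credentials)."""
    import boto3  # imported lazily so the arithmetic works without boto3
    client = boto3.client("lambda")
    return client.put_function_concurrency(
        FunctionName=function_name,
        ReservedConcurrentExecutions=limit,
    )
```

For example, with max_connections set to 100 on the Postgres side and one connection per instance, max_safe_concurrency(100) gives 95.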

answered Sep 28 '22 01:09 by Dudemullet


I don't think what you are looking for is possible at the moment. Hacks might work, but I would advise against depending on them: in a closed-source system, undocumented behavior can stop working at any time without notice.

I guess you are concerned about the number of new connections created by your Lambda functions and the load they put on the DB server.

Have you seen pgbouncer (https://pgbouncer.github.io/)? It is one of the best-known connection poolers for Postgres. I would recommend putting something like pgbouncer between your Lambda function and the DB.

This removes the load on your DB server caused by creating new connections, because connections between pgbouncer and Postgres can stay open for a long time. The Lambda functions make new connections to pgbouncer, which is more than capable of handling unclosed connections through its various timeout settings.

Update on 9th Dec 2019

AWS recently announced RDS Proxy, which is capable of connection pooling. Currently it's in preview and has no support for PostgreSQL, but they say it's coming soon.

https://aws.amazon.com/rds/proxy/

https://aws.amazon.com/blogs/compute/using-amazon-rds-proxy-with-aws-lambda/

answered Sep 28 '22 02:09 by Josnidhin