
Serverless Database Connection Pooling

I’m trying to build an application on AWS that is 100% serverless (minus the database for now), and I’m finding that the database is the bottleneck. My application can scale very well, but my database can only accommodate a finite number of connections, and at some point my Lambdas will hit that limit. I can do connection pooling outside of the handler in my Lambdas so that there is one database connection per Lambda container instead of one per invocation. That increases the number of concurrent invocations I can serve before I hit the connection limit, but the limit still exists.

I have two questions. 1. Does Aurora Serverless solve this by autoscaling the number of instances to meet the need for more connections? 2. Are there any other solutions to this problem?

Also, for other developers interested in serverless: am I trying to do something that’s not worth doing? I love how easy deployment is with the Serverless Framework, but is it better just to build microservices on something like Kubernetes instead?

Matt Clevenger asked Oct 29 '22 02:10



1 Answer

I believe there are two potential solutions to that problem:

The first and simplest option is to take advantage of the Lambda "hot state": Lambda reuses the execution context for subsequent invocations. As the AWS documentation suggests:

Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.

Basically, while the Lambda function is in the hot state, it "might/should" reuse already opened connection(s).
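The "check if a connection exists before creating one" advice can be sketched roughly as below. This is a minimal illustration, not AWS's or any driver's official pattern: `sqlite3` from the standard library stands in for a real MySQL/PostgreSQL driver so the snippet runs anywhere, and the names `get_connection` and `handler` are assumptions chosen for the example.

```python
import sqlite3

# Module-level state survives for as long as the execution context stays warm.
_connection = None


def get_connection():
    """Return the cached connection, opening one only if none exists yet."""
    global _connection
    if _connection is None:
        # Assumption: stand-in for a real driver's connect(host=..., user=...) call.
        _connection = sqlite3.connect(":memory:")
    return _connection


def handler(event, context):
    # Every invocation served by this execution context shares one connection.
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"result": row[0], "connection_id": id(conn)}
```

Calling `handler` twice in the same process returns the same `connection_id`, which mirrors what happens on a warm invocation. With a real network database you would probably also want to verify that the cached connection is still alive before reusing it, since the server may have closed it while the context was idle.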

This approach has the following limitations:

  • a connection is only reused within a single Lambda function, so if you have 5 Lambda functions invoked all the time, you will still be using at least 5 connections
  • when you have a spike in invocations, including parallel executions, this approach becomes less effective, since the majority of requests will run in new execution contexts that each open their own connection
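To make these limitations concrete, here is a rough back-of-the-envelope estimate. It assumes each warm execution context holds one connection and serves one invocation at a time, so peak connections track peak concurrency summed over all functions. The function names, concurrency figures, and connection limit are hypothetical, not measurements.

```python
def peak_connections(concurrency_by_function, connections_per_context=1):
    """Estimate open connections when every concurrent context holds its own."""
    return sum(concurrency_by_function.values()) * connections_per_context


# Hypothetical peak concurrency per Lambda function during a traffic spike.
concurrency = {"orders": 40, "users": 25, "reports": 5}

# Hypothetical connection limit configured on the database.
max_connections = 60

needed = peak_connections(concurrency)
fits = needed <= max_connections
print(f"needed={needed}, limit={max_connections}, fits={fits}")
```

With these made-up numbers the spike needs 70 connections against a limit of 60, so connection reuse alone would not be enough.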

The second option is to use a connection pool. A connection pool is a set of established database connections that is kept open so the connections can be reused for future requests to the database.
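The idea of a pool can be sketched in a few lines. This only illustrates the data structure: it runs in-process, whereas for Lambda the pool would live in a separate, long-running service, as discussed in this answer. `sqlite3` is a stand-in for a real driver, and `ConnectionPool` is a hypothetical class written for this example, not a library API.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """A fixed-size pool: connections are opened once and handed out on demand."""

    def __init__(self, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            # Assumption: stand-in for a real driver's connect() call.
            self._idle.put(sqlite3.connect(":memory:", check_same_thread=False))

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks until a connection is free; raises queue.Empty on timeout.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Return the connection to the pool instead of closing it.
            self._idle.put(conn)


pool = ConnectionPool(size=2)
with pool.connection() as conn:
    value = conn.execute("SELECT 1").fetchone()[0]
```

The pool size caps how many connections the database ever sees, no matter how many callers there are. Callers beyond that cap wait (or time out) instead of opening new connections.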

While the second option provides a more consistent solution, it requires much more infrastructure:

  • you would need to run a separate instance for the pool and, if you want to do things properly, probably at least two instances and a load balancer (unless you use containers).

While provisioning that much additional infrastructure for a connection pooler might be overwhelming, it can still be a valid option depending on the scale of the project, your existing infrastructure (maybe you are already using containers), and the cost benefits.

b.b3rn4rd answered Dec 03 '22 01:12
