
sql.DB on AWS Lambda: too many connections

As I understand it, in Go the DB handle is meant to be long-lived and shared between many goroutines.

But when I use Go with AWS Lambda, it's a very different story, since Lambda stops the function when it's finished.

I am using defer db.Close() in the Lambda invoke function, but it has no effect. On the MySQL side, the connection is still kept around as a Sleep query. As a result, it causes too many connections on MySQL.

Currently, I have to set wait_timeout in MySQL to a small number. But that's not the best solution, in my opinion.

Is there any way to close the connection when using the Go SQL driver with Lambda?

Thanks,

nguyenhoai890 asked Jan 09 '19


1 Answer

There are two problems that we need to address:

  • Correctly managing state between lambda invocations
  • Configuring a connection pool

Correctly managing state

Let us understand a bit of how the container is managed by AWS. From the AWS docs:

After a Lambda function is executed, AWS Lambda maintains the execution context for some time in anticipation of another Lambda function invocation. In effect, the service freezes the execution context after a Lambda function completes, and thaws the context for reuse, if AWS Lambda chooses to reuse the context when the Lambda function is invoked again. This execution context reuse approach has the following implications:

  • Any declarations in your Lambda function code (outside the handler code, see Programming Model) remains initialized, providing additional optimization when the function is invoked again. For example, if your Lambda function establishes a database connection, instead of reestablishing the connection, the original connection is used in subsequent invocations. We suggest adding logic in your code to check if a connection exists before creating one.

  • Each execution context provides 500MB of additional disk space in the /tmp directory. The directory content remains when the execution context is frozen, providing transient cache that can be used for multiple invocations. You can add extra code to check if the cache has the data that you stored. For information on deployment limits, see AWS Lambda Limits.

  • Background processes or callbacks initiated by your Lambda function that did not complete when the function ended resume if AWS Lambda chooses to reuse the execution context. You should make sure any background processes or callbacks (in case of Node.js) in your code are complete before the code exits.

This first bullet point says that state is maintained between executions. Let us see this in action:

let counter = 0

module.exports.handler = (event, context, callback) => {
  counter++
  callback(null, { count: counter })
}

If you deploy this and call it multiple times consecutively, you will see that the counter is incremented between calls.

Now that you know that, you should not call defer db.Close(); instead, you should reuse the database instance. You can do that by simply making db a package-level variable.

First, create a database package that will export an Open function:

package database

import (
    "fmt"
    "os"

    _ "github.com/go-sql-driver/mysql"
    "github.com/jinzhu/gorm"
)

var (
    host = os.Getenv("DB_HOST")
    port = os.Getenv("DB_PORT")
    user = os.Getenv("DB_USER")
    name = os.Getenv("DB_NAME")
    pass = os.Getenv("DB_PASS")
)

func Open() (db *gorm.DB) {
    args := fmt.Sprintf("%s:%s@tcp(%s:%s)/%s?parseTime=true", user, pass, host, port, name)
    // Initialize a new db connection.
    db, err := gorm.Open("mysql", args)
    if err != nil {
        panic(err)
    }
    return
}

Then use it on your handler.go file:

package main

import (
    "context"

    "github.com/aws/aws-lambda-go/events"
    "github.com/aws/aws-lambda-go/lambda"
    "github.com/jinzhu/gorm"
    "github.com/<username>/<name-of-lib>/database"
)

var db *gorm.DB

func init() {
    db = database.Open()
}

func Handler() (events.APIGatewayProxyResponse, error) {
    // You can use db here.
    return events.APIGatewayProxyResponse{
        StatusCode: 201,
    }, nil
}

func main() {
    lambda.Start(Handler)
}

Note: don't forget to replace github.com/<username>/<name-of-lib>/database with the right import path.

Now, you might still see the too many connections error. If that happens you will need a connection pool.

Configuring a connection pool

From Wikipedia:

In software engineering, a connection pool is a cache of database connections maintained so that the connections can be reused when future requests to the database are required. Connection pools are used to enhance the performance of executing commands on a database.

You will need a connection pool whose number of allowed connections matches the number of Lambdas running in parallel. You have two choices:

  • MySQL Proxy

MySQL Proxy is a simple program that sits between your client and MySQL server(s) and that can monitor, analyze or transform their communication. Its flexibility allows for a wide variety of uses, including load balancing, failover, query analysis, query filtering and modification, and many more.

  • AWS Aurora:

Amazon Aurora Serverless is an on-demand, auto-scaling configuration for Amazon Aurora (MySQL-compatible edition), where the database will automatically start up, shut down, and scale capacity up or down based on your application's needs. It enables you to run your database in the cloud without managing any database instances. It's a simple, cost-effective option for infrequent, intermittent, or unpredictable workloads.

Regardless of your choice, there are plenty of tutorials on the internet on how to configure both.

celicoo answered Oct 09 '22