
Error while using pymysql in flask

I am using the pymysql client to connect to MySQL in my Flask API. Everything works fine for some days (around 1-2 days), and then it suddenly starts to throw this error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1039, in _write_bytes
    self._sock.sendall(data)
TimeoutError: [Errno 110] Connection timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "Main.py", line 194, in post
    result={'resultCode':100,'resultDescription':'SUCCESS','result':self.getStudentATData(studentId,args['chapterId'])}
  File "Main.py", line 176, in getStudentATData
    cur.execute("my query")
  File "/usr/local/lib/python3.4/dist-packages/pymysql/cursors.py", line 166, in execute
    result = self._query(query)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/cursors.py", line 322, in _query
    conn.query(q)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 855, in query
    self._execute_command(COMMAND.COM_QUERY, sql)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1092, in _execute_command
    self._write_bytes(packet)
  File "/usr/local/lib/python3.4/dist-packages/pymysql/connections.py", line 1044, in _write_bytes
    "MySQL server has gone away (%r)" % (e,))
pymysql.err.OperationalError: (2006, "MySQL server has gone away (TimeoutError(110, 'Connection timed out'))")

And if I restart the application it works fine again. I have tried everything but can't seem to get rid of this. Can anyone help? As suggested, I implemented a retry mechanism of sorts, but that didn't solve the problem:

class DB:
    def connect(self):
        # db connect here
        ...

    def cursor(self):
        try:
            cursor = self.conn.cursor()
        except Exception as e:
            print(e)
            self.connect()
            cursor = self.conn.cursor()
        return cursor

And I'm consuming it as DB().cursor().

asked Dec 08 '17 by Bhargav

3 Answers

First of all, you need to decide whether you want to maintain a persistent connection to MySQL or open a fresh one per request. A persistent connection performs better, but needs a little maintenance.

The default wait_timeout in MySQL is 8 hours. Whenever a connection is idle longer than wait_timeout, it is closed. When the MySQL server is restarted, it also closes all established connections. Thus if you use a persistent connection, you need to check that the connection is alive before using it (and reconnect if it isn't). If you use a per-request connection, you don't need to maintain connection state, because connections are always fresh.
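The liveness check can be sketched with pymysql's Connection.ping(reconnect=True), which sends a lightweight ping and transparently re-establishes the connection if the socket has died. The DummyConnection below is a stand-in for a real pymysql.connect(...) so the pattern can be shown (and run) without a live server; only the DB.get_connection idea is the point:

```python
class DummyConnection:
    """Stand-in for pymysql.connections.Connection (illustration only)."""
    def __init__(self):
        self.alive = True

    def ping(self, reconnect=True):
        # Real pymysql sends COM_PING here; if the socket is dead it
        # reconnects when reconnect=True, otherwise it raises an error.
        if not self.alive:
            if not reconnect:
                raise RuntimeError('connection lost')
            self.alive = True  # simulate a successful reconnect


class DB:
    def __init__(self, connect=DummyConnection):
        self._connect = connect
        self.conn = self._connect()

    def get_connection(self):
        # Cheap liveness check before every use: ping() reconnects
        # transparently if wait_timeout (or a server restart) killed us.
        self.conn.ping(reconnect=True)
        return self.conn


db = DB()
db.conn.alive = False       # simulate the idle timeout closing the socket
conn = db.get_connection()  # ping() revives the connection
print(conn.alive)           # True
```

With a real pymysql connection the wrapper is identical: call self.conn.ping(reconnect=True) before handing the connection out, instead of catching the error after the query has already failed (which is why the retry-on-cursor() approach in the question doesn't help: cursor() itself doesn't touch the socket).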

Per request connection

A non-persistent database connection has the evident overhead of opening the connection, handshaking, and so on (for both the database server and the client) on each incoming HTTP request.

Here's a quote from Flask's official tutorial regarding database connections:

Creating and closing database connections all the time is very inefficient, so you will need to keep it around for longer. Because database connections encapsulate a transaction, you will need to make sure that only one request at a time uses the connection. An elegant way to do this is by utilizing the application context.

Note, however, that the application context is initialised per request (which is somewhat veiled by efficiency concerns and Flask's lingo). Thus this is still quite inefficient, but it should solve your issue. Here's a stripped snippet of what the tutorial suggests, as applied to pymysql:

import pymysql
from flask import Flask, g, request    

app = Flask(__name__)    

def connect_db():
    return pymysql.connect(
        user = 'guest', password = '', database = 'sakila', 
        autocommit = True, charset = 'utf8mb4', 
        cursorclass = pymysql.cursors.DictCursor)

def get_db():
    '''Opens a new database connection per request.'''        
    if not hasattr(g, 'db'):
        g.db = connect_db()
    return g.db    

@app.teardown_appcontext
def close_db(error):
    '''Closes the database connection at the end of request.'''    
    if hasattr(g, 'db'):
        g.db.close()    

@app.route('/')
def hello_world():
    city = request.args.get('city')

    cursor = get_db().cursor()
    cursor.execute('SELECT city_id FROM city WHERE city = %s', city)
    row = cursor.fetchone()

    if row:
      return 'City "{}" is #{:d}'.format(city, row['city_id'])
    else:
      return 'City "{}" not found'.format(city)

Persistent connection

For a persistent database connection there are two major options: either you have a pool of connections, or you map connections to worker threads. Because Flask WSGI applications are normally served by threaded servers with a fixed number of threads (e.g. uWSGI), thread-mapping is easier and just as efficient.

There's a package, DBUtils, which implements both: PooledDB for pooled connections and PersistentDB for thread-mapped ones.

One important caveat in maintaining a persistent connection is transactions. The API for reconnection is ping. It's safe for auto-committed single statements, but it can be disruptive in the middle of a transaction (a little more detail here). DBUtils takes care of this: PersistentDB reconnects only on dbapi.OperationalError and dbapi.InternalError (by default; controlled by the failures argument to PersistentDB's initialiser) raised outside of a transaction.
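The reconnect-outside-a-transaction rule can be illustrated with a small pure-Python sketch mirroring the behaviour DBUtils implements. FlakyConnection and safe_execute are hypothetical names for the illustration; a reconnect mid-transaction would silently discard the statements already executed, so the wrapper refuses to retry there:

```python
class OperationalError(Exception):
    pass


class FlakyConnection:
    """Stand-in connection that fails once, then works (illustration only)."""
    def __init__(self):
        self.failed_once = False
        self.in_transaction = False

    def execute(self, sql):
        if not self.failed_once:
            self.failed_once = True
            raise OperationalError('MySQL server has gone away')
        return 'ok'

    def reconnect(self):
        pass  # a real driver would re-establish the socket here


def safe_execute(conn, sql):
    """Retry once on OperationalError, but never inside a transaction:
    the caller must roll back and replay the whole transaction instead."""
    try:
        return conn.execute(sql)
    except OperationalError:
        if conn.in_transaction:
            raise
        conn.reconnect()
        return conn.execute(sql)


conn = FlakyConnection()
print(safe_execute(conn, 'SELECT 1'))  # 'ok' -- reconnected and retried
```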

Here's how the above snippet looks with PersistentDB.

import pymysql
from flask import Flask, g, request
from DBUtils.PersistentDB import PersistentDB    

app = Flask(__name__)    

def connect_db():
    return PersistentDB(
        creator = pymysql, # the remaining keyword arguments are passed to pymysql
        user = 'guest', password = '', database = 'sakila', 
        autocommit = True, charset = 'utf8mb4', 
        cursorclass = pymysql.cursors.DictCursor)

def get_db():
    '''Opens a new database connection per app.'''

    if not hasattr(app, 'db'):
        app.db = connect_db()
    return app.db.connection()    

@app.route('/')
def hello_world():
    city = request.args.get('city')

    cursor = get_db().cursor()
    cursor.execute('SELECT city_id FROM city WHERE city = %s', city)
    row = cursor.fetchone()

    if row:
      return 'City "{}" is #{:d}'.format(city, row['city_id'])
    else:
      return 'City "{}" not found'.format(city)

Micro-benchmark

To give a little clue what the performance implications are in numbers, here's a micro-benchmark.

I ran:

  • uwsgi --http :5000 --wsgi-file app_persistent.py --callable app --master --processes 1 --threads 16
  • uwsgi --http :5000 --wsgi-file app_per_req.py --callable app --master --processes 1 --threads 16

And load-tested them with concurrency 1, 4, 8, 16 via:

siege -b -t 15s -c 16 http://localhost:5000/?city=london

[chart: requests per second at each concurrency level for the persistent and per-request variants]

Observations (for my local configuration):

  1. A persistent connection is ~30% faster,
  2. On concurrency 4 and higher, uWSGI worker process peaks at over 100% of CPU utilisation (pymysql has to parse MySQL protocol in pure Python, which is the bottleneck),
  3. On concurrency 16, mysqld's CPU utilisation is ~55% for per-request and ~45% for persistent connection.

answered Nov 18 '22 by saaj


As I see, you have two options:

  • Create a new connection for every query, then close it, like this:

    def db_execute(query):
        conn = MySQLdb.connect(**connection_kwargs)  # your connection settings here
        cur = conn.cursor()
        cur.execute(query)
        res = cur.fetchall()
        cur.close()
        conn.close()
        return res
    
  • A better way is to use a connection pool, such as SQLAlchemy's sqlalchemy.pool with the pool_pre_ping argument and a custom connection function.
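A minimal sketch of the pool_pre_ping approach, assuming SQLAlchemy is installed. SQLite is used here only so the snippet runs without a MySQL server; in practice the URL would be something like 'mysql+pymysql://user:pw@host/db':

```python
from sqlalchemy import create_engine, text

# pool_pre_ping issues a lightweight ping on every connection checkout
# and transparently replaces connections the server has dropped;
# pool_recycle additionally retires connections older than the given age.
engine = create_engine('sqlite://', pool_pre_ping=True, pool_recycle=3600)

with engine.connect() as conn:
    result = conn.execute(text('SELECT 1')).scalar()
print(result)  # 1
```

With this in place the "MySQL server has gone away" error is absorbed by the pool instead of surfacing in your request handler.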

answered Nov 18 '22 by nndii


I don't believe this is an issue with Flask/pymysql as much as it is a symptom of your MySQL timeout configurations. I assume this is some sort of cloud database instance?

Have a look at this:

https://dba.stackexchange.com/questions/1558/how-long-is-too-long-for-mysql-connections-to-sleep

I'd suggest posting your question there with specifics about your setup; you may be able to get a configuration answer.

A Python solution would be to use something like sqlalchemy and flask-sqlalchemy, then set the configuration variable SQLALCHEMY_POOL_RECYCLE = 3600 to recycle connections after an hour (or whatever value you desire). Alternatively, if you don't want to add that much bulk to your project, you could implement a connection "timer" to recycle the connection yourself in the background:

import pymysql
from datetime import datetime, timedelta

class MyConnectionPool(object):
    """Class that returns a database connection <= 1 hour old"""
    refresh_time = timedelta(hours=1)

    def __init__(self, host, user, password):
        self.host = host
        self.user = user
        self.password = password

        self.db = self._get_connection()

    @property
    def connection(self):
        # Reconnect if the current connection is older than refresh_time
        if self.refresh <= datetime.now():
            self.db = self._get_connection()

        return self.db

    def _get_connection(self):
        self.refresh = datetime.now() + self.refresh_time
        return pymysql.connect(
            host=self.host,
            user=self.user,
            passwd=self.password
        )

answered Nov 18 '22 by abigperson