I recall hearing that the connection process in mysql was designed to be very fast compared to other RDBMSes, and that therefore using a library that provides connection pooling (SQLAlchemy) won't actually help you that much if you enable the connection pool.
Does anyone have any experience with this?
I'm leery of enabling it because of the possibility that if some code does something stateful to a db connection and (perhaps mistakenly) doesn't clean up after itself, that state which would normally get cleaned up upon closing the connection will instead get propagated to subsequent code that gets a recycled connection.
Connection pooling is great for scalability - if you have 100 threads/clients/end-users, each of which need to talk to the database, you don't want them all to have a dedicated connection open to the database (connections are expensive resources), but rather to share connections (via pooling).
The purpose of using connection pools is to avoid multiple connections going to a DB. It is very costly to establish the connections and would create a performance problem. By using connection pools, the JDBC driver opens a connection and creates a connection pool that will be used within the Vontu manager framework.
One of the most common issues undermining connection pool benefits is the fact that pooled connections can end up being stale. This most often happens due to inactive connections being timed out by network devices between the JVM and the database. As a result, there will be stale connections in the pool.
Instead of opening and closing connections for every request, connection pooling uses a cache of database connections that can be reused when future requests to the database are required. It lets your database scale effectively as the data stored there and the number of clients accessing it grow.
There's no need to worry about residual state on a connection when using SQLA's connection pool, unless your application is changing connectionwide options like transaction isolation levels (which generally is not the case). SQLA's connection pool issues a connection.rollback() on the connection when its checked back in, so that any transactional state or locks are cleared.
It is possible that MySQL's connection time is pretty fast, especially if you're connecting over unix sockets on the same machine. If you do use a connection pool, you also want to ensure that connections are recycled after some period of time as MySQL's client library will shut down connections that are idle for more than 8 hours automatically (in SQLAlchemy this is the pool_recycle option).
You can quickly do some benching of connection pool vs. non with a SQLA application by changing the pool implementation from the default of QueuePool to NullPool, which is a pool implementation that doesn't actually pool anything - it connects and disconnects for real when the proxied connection is acquired and later closed.
Even if the connection part of MySQL itself is pretty slick, presumably there's still a network connection involved (whether that's loopback or physical). If you're making a lot of requests, that could get significantly expensive. It will depend (as is so often the case) on exactly what your application does, of course - if you're doing a lot of work per connection, then that will dominate and you won't gain a lot.
When in doubt, benchmark - but I would by-and-large trust that a connection pooling library (at least, a reputable one) should work properly and reset things appropriately.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With