I'm not sure I understand the use case for DB connection pools (e.g. psycopg2.pool and mysql.connector.pooling) in Python. It seems to me that parallelism in Python is usually achieved with a multi-process rather than a multi-threaded approach because of the GIL, and that in the multi-process case these pools are not very useful, since each process will initialize its own pool and will only have a single thread running at a time. Is this correct? Is there any strategy for sharing a DB connection pool across multiple processes? If not, is the usefulness of pooling limited to multi-threaded Python applications, or are there other scenarios where you would use them?
DataSource objects that implement connection pooling also produce a connection to the particular data source that the DataSource class represents. The connection object that the getConnection method returns is a handle to a PooledConnection object rather than being a physical connection.
To create a connection between a MySQL database and a Python application, use the connect() method of the mysql.connector module. Pass the database details, such as the host name, username, and password, in the method call. The method returns a connection object.
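For illustration, here is a minimal sketch of that call. The host, user, password, and database values are placeholders rather than anything from the original post, and the helper names build_mysql_config and open_connection are made up for this example. It assumes the mysql-connector-python package is installed and a server is reachable before open_connection is actually called.

```python
def build_mysql_config(host, user, password, database=None):
    """Collect the keyword arguments passed to mysql.connector.connect()."""
    config = {"host": host, "user": user, "password": password}
    if database is not None:
        config["database"] = database
    return config


def open_connection(config):
    """Open one physical connection; the caller is responsible for close()."""
    # Imported here so the sketch still loads where the driver is not installed.
    import mysql.connector

    return mysql.connector.connect(**config)


# Placeholder credentials (hypothetical, for illustration only).
config = build_mysql_config("localhost", "app_user", "secret", "app_db")
```

Calling open_connection(config) would then return the connection object described above; call close() on it when you are done.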
A connection pool is created for each unique connection string. When a pool is created, multiple connection objects are created and added to the pool so that the minimum pool size requirement is satisfied. Connections are added to the pool as needed, up to the maximum pool size specified (100 is the default).
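The minimum/maximum sizing behaviour described above can be sketched roughly as follows. This is a toy pool written for this explanation, not how psycopg2.pool or mysql.connector.pooling are implemented. SimplePool and its methods are hypothetical names, and sqlite3 in-memory connections stand in for real database connections so the sketch runs with the standard library alone.

```python
import queue
import sqlite3
import threading


class SimplePool:
    """Toy pool: opens min_size connections up front, grows up to max_size."""

    def __init__(self, factory, min_size=1, max_size=100):
        if not 0 <= min_size <= max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        self._factory = factory
        self._max_size = max_size
        self._idle = queue.LifoQueue()   # connections waiting to be reused
        self._lock = threading.Lock()    # guards the _created counter
        self._created = 0
        for _ in range(min_size):
            self._idle.put(self._new_connection())

    def _new_connection(self):
        conn = self._factory()
        self._created += 1
        return conn

    @property
    def size(self):
        """Number of connections opened so far (idle or in use)."""
        return self._created

    def acquire(self, timeout=None):
        """Reuse an idle connection, else open a new one, else wait."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        with self._lock:
            if self._created < self._max_size:
                return self._new_connection()
        # Pool is at max_size: block until another caller releases one.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        """Hand a connection back so the next acquire() can reuse it."""
        self._idle.put(conn)


pool = SimplePool(lambda: sqlite3.connect(":memory:"), min_size=2, max_size=3)
```

Here two connections exist as soon as the pool is built, a third is opened only when a third caller asks for one, and a fourth caller waits until someone calls release().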
Keith,
You're on the right track. As mentioned in the S.O post "Accessing a MySQL connection pool from Python multiprocessing,":
Making a separate pool for each process is redundant and opens up way
too many connections.
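To put a number on "too many connections": if every worker process holds its own pool, the worst case is roughly the number of processes times the per-process pool size. The sketch below only does that arithmetic. The function names and the figures in the usage note are illustrative assumptions, not values from either post.

```python
def total_connections(worker_processes, pool_size_per_process):
    """Worst-case open connections when every process fills its own pool."""
    return worker_processes * pool_size_per_process


def pool_size_for(server_limit, worker_processes, reserved=0):
    """Largest per-process pool size that stays within the server's limit.

    `reserved` is the number of connections kept free for other clients
    (admin sessions, migrations, and so on).
    """
    usable = server_limit - reserved
    if worker_processes <= 0 or usable < worker_processes:
        raise ValueError("not enough connections for one per worker")
    return usable // worker_processes
```

For example, total_connections(8, 32) is 256, while pool_size_for(151, 8, reserved=11) gives 17, so eight workers with a server limit of 151 would each need a much smaller pool than 32.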
Check out the other S.O post, "What is the best solution for database connection pooling in python?". It contains a sample pooling solution in Python. That post also discusses the limitations of a hand-rolled pool if your application were to become multi-threaded:
Making your own connection pool is a BAD idea if your app ever decides to start using
multi-threading. Making a connection pool for a multi-threaded application is much
more complicated than one for a single-threaded application. You can use something
like PySQLPool in that case.
In terms of implementing DB pooling in Python, as mentioned in "Application vs Database Resident Connection Pool," if your database supports it, the best implementation would be to:
Let connection pool be maintained and managed by database itself
(example: Oracle's DRCP) and calling modules just ask connections from the connection
broker described by Oracle DRCP.
Please let me know if you have any questions!