We recently had an outage in which application threads got stuck while retrieving connections from c3p0. The configuration is the following:
Version of c3p0 used: 0.9.1.2
Under normal conditions everything works fine and c3p0 has served us well. However, during a recent network event (a network partition in which application hosts could not talk to the database), we saw applications stuck indefinitely while trying to get connections from c3p0.
Stacktrace seen in logs:
Caused by: java.sql.SQLException: An attempt by a client to checkout a Connection has timed out.
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:106)
at com.mchange.v2.sql.SqlUtils.toSQLException(SqlUtils.java:65)
at com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool.checkoutPooledConnection(C3P0PooledConnectionPool.java:527)
at com.mchange.v2.c3p0.impl.AbstractPoolBackedDataSource.getConnection(AbstractPoolBackedDataSource.java:128)
at amazon.identity.connection.WrappedDataSource.getConnectionWithOptionalCredentials(WrappedDataSource.java:42)
at amazon.identity.connection.LoggingDataSource.getConnectionWithOptionalCredentials(LoggingDataSource.java:55)
at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
at amazon.identity.connection.WrappedDataSource.getConnectionWithOptionalCredentials(WrappedDataSource.java:42)
at amazon.identity.connection.ConnectionProfilingDataSource.profileGetConnectionWithOptionalCredentials(ConnectionProfilingDataSource.java:118)
at amazon.identity.connection.ConnectionProfilingDataSource.getConnectionWithOptionalCredentials(ConnectionProfilingDataSource.java:99)
at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
at amazon.identity.connection.CallCountTrackingDataSource.getConnectionWithOptionalCredentials(CallCountTrackingDataSource.java:82)
at amazon.identity.connection.WrappedDataSource.getConnection(WrappedDataSource.java:30)
at com.amazon.jdbc.FailoverDataSource.doGetConnection(FailoverDataSource.java:133)
at com.amazon.jdbc.FailoverDataSource.getConnection(FailoverDataSource.java:109)
at com.amazon.identity.accessmanager.WrappedConnection$1.call(WrappedConnection.java:84)
at com.amazon.identity.accessmanager.WrappedConnection$1.call(WrappedConnection.java:82)
at com.amazon.identity.accessmanager.WrappedConnection.getConnection(WrappedConnection.java:110)
... 40 more
Caused by: com.mchange.v2.resourcepool.TimeoutException: A client timed out while waiting to acquire a resource from com.mchange.v2.resourcepool.BasicResourcePool@185e5c6b -- timeout at awaitAvailable()
at com.mchange.v2.resourcepool.BasicResourcePool.awaitAvailable(BasicResourcePool.java:1317)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:557)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
at com.mchange.v2.resourcepool.BasicResourcePool.prelimCheckoutResource(BasicResourcePool.java:584)
....... (total of 317 such instances of prelimCheckoutResource):
Some excerpts I pulled from the c3p0 documentation:
When a c3p0 DataSource attempts and fails to acquire a Connection, it will retry up to acquireRetryAttempts times, with a delay of acquireRetryDelay between each attempt. If all attempts fail, any clients waiting for Connections from the DataSource will see an Exception, indicating that a Connection could not be acquired. Note that clients do not see any Exception until a full round of attempts fail, which may be some time after the initial Connection attempt. If acquireRetryAttempts is set to 0, c3p0 will attempt to acquire new Connections indefinitely, and calls to getConnection() may block indefinitely waiting for a successful acquisition.
checkoutTimeout limits how long a client will wait for a Connection, if all Connections are checked out and one cannot be supplied immediately
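To make the retry behaviour described in the first excerpt concrete, here is a rough sketch of a bounded retry loop in plain Java. It is not c3p0's internal code: the class name AcquireRetrySketch, the method acquireWithRetry and its parameters are made up for illustration, and the only behaviour it mirrors is what the excerpt states (retry N times with a delay between attempts, and retry forever when the attempt count is 0).

```java
import java.util.concurrent.Callable;

// Illustrative only: a generic retry loop, NOT c3p0's implementation.
final class AcquireRetrySketch {

    /**
     * Calls acquirer up to maxAttempts times, sleeping delayMillis between
     * failed attempts. A maxAttempts of 0 means "retry forever", as in the
     * excerpt; treating negative values the same way is an assumption.
     */
    static <T> T acquireWithRetry(Callable<T> acquirer, int maxAttempts, long delayMillis)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; maxAttempts <= 0 || attempt <= maxAttempts; attempt++) {
            try {
                return acquirer.call();
            } catch (Exception e) {
                last = e; // remember the failure and try again
            }
            boolean moreAttempts = maxAttempts <= 0 || attempt < maxAttempts;
            if (moreAttempts) {
                Thread.sleep(delayMillis);
            }
        }
        // Only reached when a full round of attempts has failed.
        throw new Exception("acquisition failed after " + maxAttempts + " attempts", last);
    }

    public static void main(String[] args) {
        int[] calls = {0};
        try {
            // Simulates an unreachable database: every attempt fails.
            acquireWithRetry(() -> {
                calls[0]++;
                throw new java.io.IOException("database unreachable");
            }, 3, 10L);
        } catch (Exception e) {
            System.out.println(e.getMessage() + " (calls made: " + calls[0] + ")");
        }
    }
}
```

Note that with maxAttempts set to 0 the loop never reaches the final throw, which is the "may block indefinitely" case from the excerpt.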
So here's my theory around why this happened:
The network partition lasted for several minutes. I am assuming that by then the idle connection tests would have invalidated all active connections in the pool, which means c3p0 would have been busy acquiring new connections. If any application host tried to obtain a connection from the pool, it would have had to wait indefinitely until a connection was acquired (see the excerpt from the c3p0 docs). Also, the checkout timeout parameter would not have helped in this case, since it enforces a timeout only if all connections are checked out (and this was not the case).
My question here is the following:
Thanks
I hope my case does not happen to you; in this article I will introduce the C3P0 connection pool library. C3P0 is an easy-to-use library that helps developers apply the connection pool pattern easily and efficiently, and that allows connections to recover after a database outage.
Basically, C3P0 wraps a set of DataSource objects and manages them according to the provided configuration. This article does not cover how C3P0 works internally. Before digging into the coding demo, I would like to introduce how Spring Boot selects a connection-pool library and how developers can specify their choice.
As we know, the most powerful feature of Spring Boot is auto-configuration, which helps developers create projects faster and with less code. Unfortunately, at the time of writing, Spring Boot auto-configuration does not yet support C3P0.
"The network partition lasted for several minutes. I am assuming that by then the idle connection tests would have invalidated all active connections in the pool, which means c3p0 would have been busy acquiring new connections. If any application host tried to obtain a connection from the pool, it would have had to wait indefinitely until a connection was acquired (see the excerpt from the c3p0 docs)."
"Also, the checkout timeout parameter would not have helped in this case, since it enforces a timeout only if all connections are checked out (and this was not the case)."
According to the c3p0 documentation, this timeout is enforced "at checkout", not once the connection is already checked out, so it should help you.
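As a rough illustration of what a timeout enforced at checkout means for a waiting client, here is a toy pool in plain Java. It is only a sketch: CheckoutTimeoutSketch, checkin and checkout are hypothetical names, strings stand in for connections, and it does not reproduce c3p0's actual pool logic. The point is just that a client waiting on an empty pool is released after a bounded time instead of blocking forever.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Illustrative only: a toy pool built on a BlockingQueue, NOT c3p0 itself.
final class CheckoutTimeoutSketch {
    private final BlockingQueue<String> idle;

    CheckoutTimeoutSketch(int capacity) {
        this.idle = new ArrayBlockingQueue<>(capacity);
    }

    /** Returns a connection to the pool (silently dropped if the pool is full). */
    void checkin(String connection) {
        idle.offer(connection);
    }

    /** Waits at most timeoutMillis for an idle connection, then gives up. */
    String checkout(long timeoutMillis) throws InterruptedException, TimeoutException {
        String connection = idle.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (connection == null) {
            throw new TimeoutException("no connection available within " + timeoutMillis + " ms");
        }
        return connection;
    }

    public static void main(String[] args) throws Exception {
        CheckoutTimeoutSketch pool = new CheckoutTimeoutSketch(2);
        // The pool is empty, as if every connection had been invalidated and
        // the database could not be reached to create new ones.
        try {
            pool.checkout(200);
        } catch (TimeoutException e) {
            System.out.println("client released: " + e.getMessage());
        }
        pool.checkin("conn-1");
        System.out.println("checked out: " + pool.checkout(200));
    }
}
```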
The checkoutTimeout is there to help you with client timeouts, so there is no need to implement anything else; however, I would say that trying to obtain a connection indefinitely is a mistake. I'm actually using the default acquire timeout of 30 x 1000 ms = 30 seconds (acquireRetryAttempts x acquireRetryDelay).
I would also say that checkoutTimeout should be greater than or equal to the acquire timeout (acquireRetryAttempts * acquireRetryDelay), otherwise the second will apply.
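The arithmetic behind that rule of thumb can be written down as a small check. This is only a sketch: TimeoutBudgetSketch and its methods are hypothetical helper names, not part of c3p0, and the calculation ignores the time each individual connection attempt takes, so the real acquisition round may last longer than the product suggests.

```java
// Illustrative only: plain arithmetic on the settings discussed above.
final class TimeoutBudgetSketch {

    /** Length of one full round of acquisition retries, in ms (delays only). */
    static long acquireWindowMillis(int acquireRetryAttempts, int acquireRetryDelayMillis) {
        return (long) acquireRetryAttempts * acquireRetryDelayMillis;
    }

    /** True if the client-side wait is at least as long as a full retry round. */
    static boolean checkoutCoversAcquireWindow(int checkoutTimeoutMillis,
                                               int acquireRetryAttempts,
                                               int acquireRetryDelayMillis) {
        return checkoutTimeoutMillis
                >= acquireWindowMillis(acquireRetryAttempts, acquireRetryDelayMillis);
    }

    public static void main(String[] args) {
        // 30 attempts x 1000 ms delay, the figures quoted in the answer.
        System.out.println("acquire window (ms): " + acquireWindowMillis(30, 1000));
        System.out.println("30 s checkout covers it: "
                + checkoutCoversAcquireWindow(30_000, 30, 1000));
        System.out.println("10 s checkout covers it: "
                + checkoutCoversAcquireWindow(10_000, 30, 1000));
    }
}
```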