Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

DEADLOCK with C3P0 on java se application

I encountered a critical problem with the c3p0 library (version 0.9.5.2) that I use in my Java SE application.

My application uses a Thread Pool to parallelize task by executing jobs.

Each job uses the database to read, update or delete data at least once up to (in very rare cases but it can happen) more than 10,000 times.

I therefore included in my project c3p0 library to have a connection pool to the database so that all workers in my thread pool can simultaneously interact with it.

I do not have any problems when running my application on my development environment (OSX 10.11), but when I run it in production (Linux Debian 8) I encounter a big problem ! Indeed it freezes....

At first it was a deadlock with the following trace stack:

[WARNING] com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@479d237b -- APPARENT DEADLOCK!!! Creating emergency threads for unassigned pending tasks!
[WARNING] com.mchange.v2.async.ThreadPoolAsynchronousRunner$DeadlockDetector@479d237b -- APPARENT DEADLOCK!!! Complete Status: 
    Managed Threads: 3
    Active Threads: 3
    Active Tasks: 
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@264fb34f
            on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#2
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@39a5576b
            on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#1
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@5e676544
            on thread: C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#0
    Pending Tasks: 
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask@6848208c
Pool thread stack traces:
    Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#2,5,main]
        sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
        com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
        com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
        com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
    Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#1,5,main]
        sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
        com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
        com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
        com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)
    Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1adv4kd1qtfdi6|659f3099]-HelperThread-#0,5,main]
        sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
        sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
        sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
        sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
        sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocketUsingJavaNIO(IOBuffer.java:2438)
        com.microsoft.sqlserver.jdbc.SocketFinder.findSocket(IOBuffer.java:2290)
        com.microsoft.sqlserver.jdbc.TDSChannel.open(IOBuffer.java:551)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectHelper(SQLServerConnection.java:1962)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.login(SQLServerConnection.java:1627)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connectInternal(SQLServerConnection.java:1458)
        com.microsoft.sqlserver.jdbc.SQLServerConnection.connect(SQLServerConnection.java:772)
        com.microsoft.sqlserver.jdbc.SQLServerDriver.connect(SQLServerDriver.java:1168)
        com.mchange.v2.c3p0.DriverManagerDataSource.getConnection(DriverManagerDataSource.java:175)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:220)
        com.mchange.v2.c3p0.WrapperConnectionPoolDataSource.getPooledConnection(WrapperConnectionPoolDataSource.java:206)
        com.mchange.v2.c3p0.impl.C3P0PooledConnectionPool$1PooledConnectionResourcePoolManager.acquireResource(C3P0PooledConnectionPool.java:203)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquire(BasicResourcePool.java:1138)
        com.mchange.v2.resourcepool.BasicResourcePool.doAcquireAndDecrementPendingAcquiresWithinLockOnSuccess(BasicResourcePool.java:1125)
        com.mchange.v2.resourcepool.BasicResourcePool.access$700(BasicResourcePool.java:44)
        com.mchange.v2.resourcepool.BasicResourcePool$ScatteredAcquireTask.run(BasicResourcePool.java:1870)
        com.mchange.v2.async.ThreadPoolAsynchronousRunner$PoolThread.run(ThreadPoolAsynchronousRunner.java:696)

Subsequently I made some changes following the advice on different websites:

System.setProperty("com.mchange.v2.log.MLog", "com.mchange.v2.log.FallbackMLog");
System.setProperty("com.mchange.v2.log.FallbackMLog.DEFAULT_CUTOFF_LEVEL", "WARNING");

// Create db pool
final ComboPooledDataSource cpds = new ComboPooledDataSource() ;

// Driver
cpds.setDriverClass( "com.microsoft.sqlserver.jdbc.SQLServerDriver" ); // loads the jdbc driver                     

// Url
cpds.setJdbcUrl( "jdbc:xxxx://xxxxx:xxxx;database=xxxxx;" );

// Username / Password
cpds.setUser( "xxxx" ) ;
cpds.setPassword( "xxxx" ) ;                                  

// Start size of db pool
cpds.setInitialPoolSize( 8 );

// Min and max db pool size
cpds.setMinPoolSize(  8 ) ;
cpds.setMaxPoolSize( 10 ) ;

// ????
cpds.setNumHelperThreads( 5 ) ;

// Max allowed time to execute statement for a connection
// @See http://stackoverflow.com/questions/14730379/apparent-deadlock-creating-emergency-threads-for-unassigned-pending-tasks
cpds.setMaxAdministrativeTaskTime( 60 ) ;

// ?????
cpds.setMaxStatements( 180 ) ;
cpds.setMaxStatementsPerConnection( 180 ) ;

// ?????
cpds.setUnreturnedConnectionTimeout( 60 ) ;

// ?????
cpds.setStatementCacheNumDeferredCloseThreads(1);

// We make a test : open and close opened connection
cpds.getConnection().close() ;

After these changes, after the execution of some jobs, the application freezes for several ten of seconds and then displays this error message:

[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#4,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@4128b402
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#4,5,main]] interrupted.
[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#3,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@5d6aab6d
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#3,5,main]] interrupted.
[WARNING] A task has exceeded the maximum allowable task time. Will interrupt() thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#0,5,main]], with current task: com.mchange.v2.resourcepool.BasicResourcePool$1DestroyResourceTask@70a3328f
[WARNING] Thread [Thread[C3P0PooledConnectionPoolManager[identityToken->z8kfsx9l1ao3z0x88z7oi|4dd889bd]-HelperThread-#0,5,main]] interrupted.

My questions are :

  1. Why does the application work perfectly in a development environment and encounters these problems during production?
  2. Above all, how to remedy it?
  3. When a connection reaches the maximum number of statements defined with setMaxStatements and setMaxStatementsPerConnection, what happens to it? The connection is closed then another takes over while another one is created?
  4. I did not quite understand the impact that the setStatementCacheNumDeferredCloseThreads function has on my application.

Thank you very much ! Have a good day.

like image 915
Skaÿ Avatar asked Jan 16 '17 20:01

Skaÿ


1 Answers

OK. So. Your basic problem is simple. In your production environment, Connection acquisition attempts are eventually freezing, that is they are neither succeeding nor failing with an Exception, they are simply hanging. Ultimately, this is what you have to debug: Why is it that when c3p0 tries to connect to your production database, sometimes those calls to Driver.connect() hang? Whatever is causing that is outside of c3p0's control. You could be hitting limits in total connections at the DBMS side (not from this application, your maxPoolSize is quite modest, but perhaps your production server is overextended). If you are running on an older JVM, there was a known problem with hangs to SQLServer, see e.g. JDBC connection hangs with no response from SQL Server 2008 r2 Driver.getConnection hangs using SQLServer driver and Java 1.6.0_29 but I doubt you are running Java 6 at this point, and I don't know of more recent issues. In any case, it's quite clear from your logs that this is what is happening: c3p0 is trying to acquire Connections from the DBMS, the DBMS is hanging indefinitely, eventually all of c3p0's helper threads get saturated by hung tasks and you see an APPARENT DEADLOCK. To resolve the issue, you have to debug why attempts by your JDBC driver to connect to your DBMS sometimes hang.

Most of the things you did after scrounging random troubleshooting posts were not very relevant to this issue. The thing that did cause your logs to change was this setting

cpds.setMaxAdministrativeTaskTime( 60 );

That works around the problem in an ugly way. If a task hangs for a long period of time, that setting causes c3p0 to interrupt() the Thread on which it is running and abandon it. That prevents the deadlocks, but doesn't address their cause.

There is a surprising change, though, between the two logs. The replacement of APPARENT DEADLOCK spew with reports that a 'task has exceeded the maximum allowable task time' were to be expected. But interestingly, in your second log, the tasks that get interrupt()ed are not Connection acquisition attempts, but Connection destruction attempts. I don't know why that has changed, but the core issue is the same: attempts by your JDBC driver to interact with your DBMS are freezing indefinitely, nether succeeding nor failing promptly with an Exception. That is what you need to debug.

If you can't resolve the problem, you may be able to work around it. It's very ugly, but if you reduce maxAdministrativeTaskTime (say to 30) and increase numHelperThreads (say to 20), you may be able to largely eliminate the application pauses, as long as the freezes are infrequent. Increasing numHelperThreads increases the number of frozen tasks c3p0's Thread pool can tolerate before being completely blocked. reducing maxAdministrativeTaskTime reduces the lifetime of blockages. Obviously, the right thing to do is to debug the problem between the JDBC driver and the DBMS. But if that proves impossible, sometimes a workaround is the best you can do.

I would eliminate (at least for now) these three settings:

// ?????
cpds.setMaxStatements( 180 ) ;
cpds.setMaxStatementsPerConnection( 180 ) ;

// ?????
cpds.setStatementCacheNumDeferredCloseThreads(1);

The first two turn Statement caching on, which may or may not be desirable from a performance perspective for your application. But they increase the complexity of c3p0's interaction with the DBMS. SQLServer (among several databases) is very fragile with respect to multithreaded use of a Connection (which, at least per early versions of the JDBC spec, should be legal, but too bad). Setting statementCacheNumDeferredCloseThreads to 1 ensures that the Statement cache doesn't try to close an expiring Statement while the Connection is otherwise in use, and so prevents freezes, APPARENT DEADLOCKs that show up usually as hung Statement close tasks, not your issue. If you turn the Statement cache on, by all means keep statementCacheNumDeferredCloseThreads set to 1 to avoid freezes. But the safest, sanest thing is to avoid all the complexity of the Statement cache until you have your main issue debugged. You can restore these settings later to test whether they improve your application's performance. (If you do turn the Statement cache back on, my suggestion is that you just set maxStatementsPerConnection, and do not set a global maxStatements, or if you set both, set the per-Connection limit to a value much smaller than the global limit. But again, for now, just turn all this stuff off.)

To get to your specific questions:

  1. Why does the application work perfectly in a development environment and encounters these problems during production?

That's an important clue that you want to use in debugging the hang between you JDBC driver and your DBMS. Something about your production server leads to a hang that does not show up in your development server. That may just a matter of the relatively low load on your development server and high load on your production server. But there may be other differences in settings that provide clues about the hangs.

  1. Above all, how to remedy it?

Debug the hangs. If you cannot debug the hangs, try working around the issue with a shorter maxAdministrativeTaskTime and a larger numHelperThreads.

  1. When a connection reaches the maximum number of statements defined with setMaxStatements and setMaxStatementsPerConnection, what happens to it? The connection is closed then another takes over while another one is created?

A Connection doesn't reach any of those things. These are parameters that describe the Statement cache. When the total number of cached Statements hits maxStatements, the least-recently-used cached Statement is closed (just the Statement, not its Connection). When a Connection's maxStatementsPerConnection is hit, that Connection's least-recently-used cached Statement is closed (but the Connection itself remains open and active).

  1. I did not quite understand the impact that the setStatementCacheNumDeferredCloseThreads function has on my application.

If you are using the Statement cache (again, I recommend you turn it off for now), this setting ensures that expired Statements (see above) are not close()ed while their parent Connection is in use by some other Thread. The setting creates a dedicated Thread (or Threads) whose sole purpose is to wait for when Connections are no longer in use and close them only then (thus, statement cache deferred close threads).

I hope this helps!

Update: The bug that you are experiencing does look very much like the Java 6 bug. If you are running Java 6, you are in luck, the fix is probably just to update your production JVM to the most recent version of Java 6.

like image 159
Steve Waldman Avatar answered Nov 14 '22 21:11

Steve Waldman