Using ehCache 2.4.4, I seem to have gotten into a deadlock on an ehCache Segment object. From other logging, I know that the 'waiting' thread, 1694, last ran anything 9 hours before this stack trace was generated. In the meantime, 1696 has gone on to do a lot of other work, so this lock is definitely being held errantly.
I'm pretty confident that I am not locking any Segment instances directly, so I assume this is some kind of issue internal to the library. Any ideas?
"Model Executor - 1696" Id=1696 in TIMED_WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@92eb1ed
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.parkNanos(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
at java.util.concurrent.PriorityBlockingQueue.poll(Unknown Source)
at com.rtrms.application.modeling.local.BlockingTaskList.takeTask(BlockingTaskList.java:20)
at com.rtrms.application.modeling.local.ModelExecutor.executeNextTask(ModelExecutor.java:71)
at com.rtrms.application.modeling.local.ModelExecutor.run(ModelExecutor.java:46)
Locked synchronizers: count = 1
- java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@4a3d767f
"Model Executor - 1694" Id=1694 in WAITING on lock=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@4a3d767f
owned by Model Executor - 1696 Id=1696
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(Unknown Source)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(Unknown Source)
at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(Unknown Source)
at net.sf.ehcache.store.compound.Segment.unretrievedGet(Segment.java:248)
at net.sf.ehcache.store.compound.CompoundStore.unretrievedGet(CompoundStore.java:191)
at net.sf.ehcache.store.compound.impl.DiskPersistentStore.containsKeyInMemory(DiskPersistentStore.java:72)
at net.sf.ehcache.Cache.searchInStoreWithStats(Cache.java:1884)
at net.sf.ehcache.Cache.get(Cache.java:1549)
at com.rtrms.amoeba.cache.DistributedModeledSecurities.get(DistributedModeledSecurities.java:57)
at com.rtrms.amoeba.modeling.AssertPersistedModeledSecurities.get(AssertPersistedModeledSecurities.java:44)
at com.rtrms.application.modeling.tasks.ExpandableModelingTask.getNextUnexecutedTask(ExpandableModelingTask.java:35)
at com.rtrms.application.modeling.local.BlockingTaskList.takeTask(BlockingTaskList.java:36)
at com.rtrms.application.modeling.local.ModelExecutor.executeNextTask(ModelExecutor.java:71)
at com.rtrms.application.modeling.local.ModelExecutor.run(ModelExecutor.java:46)
Locked synchronizers: count = 0
It turns out that calls like Cache.acquireWriteLockOnKey end up obtaining a lock on the internal Segment, so this apparent deadlock was caused by an unlock call that wasn't in a finally block.
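For anyone who hits the same thing, the safe pattern is simply to release the key lock in a finally block. A minimal sketch (the cache name, key, and value below are made up for illustration, not taken from our code):

import net.sf.ehcache.Cache;
import net.sf.ehcache.CacheManager;
import net.sf.ehcache.Element;

public class KeyLockExample {
    public static void main(String[] args) {
        // Programmatic configuration so the example is self-contained.
        CacheManager manager = CacheManager.create();
        Cache cache = new Cache("securities", 1000, false, true, 0, 0);
        manager.addCache(cache);

        String key = "IBM";

        // acquireWriteLockOnKey takes the write lock of the Segment that owns
        // this key, so the matching release has to be guaranteed to run.
        cache.acquireWriteLockOnKey(key);
        try {
            cache.put(new Element(key, "modeled-" + key));
        } finally {
            // Without this finally, an exception in the try block leaves the
            // Segment write-locked, and any later get() that hashes to the same
            // Segment parks forever -- the behaviour in the dump above.
            cache.releaseWriteLockOnKey(key);
        }

        manager.shutdown();
    }
}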
Editorial comment: it also implies that you can get contention when trying to lock two different keys that just happen to be in the same Segment, which is pretty unfortunate.