I'm seeing a problem with multiple Threads deadlocking on the same line of code. I cannot reproduce the problem locally or in any test, but yet Thread Dumps from Production have show the problem quite clearly.
I can't see why the Threads would become blocked on the synchronized line below, since there is no other synchronization on the Object in the call stack or in any other Thread. Does anyone have any idea what is going on, or how I can even reproduce this issue (Currently trying with 15 Threads all hitting trim() in a loops, while processing 2000 tasks through my Queue - But unable to reproduce)
In the Thread dump below, I think the multiple Threads with the 'locked' status may be a manifestation of Java Bug: http://bugs.java.com/view_bug.do?bug_id=8047816 where JStack reports Threads in wrong state. (I'm using JDK Version: 1.7.0_51)
Cheers!
Here is a view of the Threads in the Thread dump.....
"xxx>Job Read-3" daemon prio=10 tid=0x00002aca001a6800 nid=0x6a3b waiting for monitor entry [0x0000000052ec4000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- locked <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9c2680> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-2" daemon prio=10 tid=0x00002aca001a5000 nid=0x6a3a waiting for monitor entry [0x0000000052d83000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- locked <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9ed518> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-1" daemon prio=10 tid=0x00002aca00183000 nid=0x6a39 waiting for monitor entry [0x0000000052c42000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Locked ownable synchronizers:
- <0x00002aaf5f9ecde8> (a java.util.concurrent.ThreadPoolExecutor$Worker)
"xxx>Job Read-0" daemon prio=10 tid=0x0000000006a83000 nid=0x6a36 waiting for monitor entry [0x000000005287f000]
java.lang.Thread.State: BLOCKED (on object monitor)
at com.mycompany.collections.CustomQueue.remove(CustomQueue.java:101)
- waiting to lock <0x00002aae6465a650> (a java.util.ArrayDeque)
at com.mycompany.collections.CustomQueue.trim(CustomQueue.java:318)
at com.mycompany.collections.CustomQueue.itemProcessed(CustomQueue.java:302)
at com.mycompany.collections.CustomQueue.trackCompleted(CustomQueue.java:147)
at java.util.concurrent.ThreadPoolExecutor.afterExecute(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
Here is the Java code extracted, which shows where the error is...
public class Deadlock {
final Deque<Object> delegate = new ArrayDeque<>();
final long maxSize = Long.MAX_VALUE;
private final AtomicLong totalExec = new AtomicLong();
private final Map<Object, AtomicLong> totals = new HashMap<>();
private final Map<Object, Deque<Long>> execTimes = new HashMap<>();
public void trim() {
//Possible optimization is evicting in chunks, segmenting by arrival time
while (this.totalExec.longValue() > this.maxSize) {
final Object t = this.delegate.peek();
final Deque<Long> execTime = this.execTimes.get(t);
final Long exec = execTime.peek();
if (exec != null && this.totalExec.longValue() - exec > this.maxSize) {
//If Job Started Inside of Window, remove and re-loop
remove();
}
else {
//Otherwise exit the loop
break;
}
}
}
public Object remove() {
Object removed;
synchronized (this.delegate) { //4 Threads deadlocking on this line !
removed = this.delegate.pollFirst();
}
if (removed != null) {
itemRemoved(removed);
}
return removed;
}
public void itemRemoved(final Object t) {
//Decrement Total & Queue
final AtomicLong catTotal = this.totals.get(t);
if (catTotal != null) {
if (!this.execTimes.get(t).isEmpty()) {
final Long exec = this.execTimes.get(t).pollFirst();
if (exec != null) {
catTotal.addAndGet(-exec);
this.totalExec.addAndGet(-exec);
}
}
}
}
}
From the documentation for HashMap
Note that this implementation is not synchronized. If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally.
(Emphasis theirs)
You are both reading and writing to/from the Maps in an unsynchronized manner.
I see no reason to assume that your code is thread safe.
I suggest that you have an infinite loop in trim caused by this lack of thread safety.
Entering a synchronized block is relatively slow, so it's likely that a thread dump will always show at least a few threads waiting to obtain the lock.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With