
Single Thread Executor Silently Drops Tasks

I am struggling with an issue where, after working smoothly for most of the day, a callable task is put into a Java Single Thread Executor and apparently never gets executed. Subsequent calls to submit a new task fail and the ExecutorService seems to be dead. At this point the client producing the tasks is out of service until the process can be restarted, which is not possible during business hours.

Some background: Multiple high-throughput producer threads place their tasks onto their own dedicated Single Thread ExecutorService and return immediately. Low latency is very important for the producer threads. There is a one-to-one relationship between the producer threads and the executor threads. The tasks need to be processed in order for each producer thread. The tasks can queue up in the executor and take as long as they need to execute. The traffic is bursty, so the consumers always catch up with their producers.

JDK: jdk1.8.0_92 on RedHat Linux

I define my Executor Service:

private final ExecutorService inboundMsgSender = Executors.newSingleThreadExecutor();

The producer threads invoke a callback:

public void onMessageFromFix(MessageEvent event, final Message message) {
    log.info("submit to Executor: " + message.toString());
    inboundMsgSender.submit(new Callable<Void>() {
        public Void call() {
            try {
                onMessageFromExecutor(event, message);
            } catch (Throwable e) {
                log.error("error", e);
            }
            return null;
        }
    });
}

The ExecutorService invokes the callable:

public void onMessageFromExecutor(MessageEvent event, final Message message) {
    try {
        log.info("call from Executor: " + message.toString());
        doExpensiveLogic(message);
    } catch (Exception e) {
        log.error("error", e);
    }
}

Under normal conditions I see in the log file:

submit to Executor: 4928

call from Executor: 4928

This is how I know the Executor thread is running the Callable.

When the issue occurs, I only see the following:

submit to Executor: 4928

with no subsequent call from Executor and no Exceptions.

Michael Starkie asked Nov 09 '22 01:11


1 Answer

The callable task is never executed because the single thread inside the inboundMsgSender ExecutorService is blocked: it is waiting on FutureTask.get() inside doExpensiveLogic(message) from a previous call. New tasks are still accepted and queued, but nothing is left to run them.
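To make the failure mode concrete, here is a minimal sketch that reproduces it in isolation. The class name BlockedExecutorDemo and the never-completed CompletableFuture are assumptions for illustration; the future only stands in for whatever doExpensiveLogic() was actually waiting on.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class BlockedExecutorDemo {

    /** Returns true only if the second task started while the first was still blocked. */
    public static boolean secondTaskRanWhileFirstBlocked() throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        // Hypothetical stand-in for the Future the real task waits on; nobody completes it.
        final CompletableFuture<Void> neverCompleted = new CompletableFuture<>();
        final CountDownLatch secondTaskStarted = new CountDownLatch(1);
        try {
            // First task parks the only worker thread inside Future.get().
            executor.submit(() -> {
                neverCompleted.get();
                return null;
            });
            // Second task is accepted and queued without any exception...
            executor.submit(() -> {
                secondTaskStarted.countDown();
            });
            // ...but it cannot start, because the single worker is still parked.
            return secondTaskStarted.await(500, TimeUnit.MILLISECONDS);
        } finally {
            // Interrupts the parked worker so the JVM can exit.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("second task ran: " + secondTaskRanWhileFirstBlocked());
    }
}
```

From the producer's side this looks the same as in the question: submit() returns normally, the "submit to Executor" line is logged, and the matching "call from Executor" line never appears.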

The lesson here is that I assumed the ExecutorService's thread was dying when it was actually just blocked. Thread death is already handled by the ExecutorService (a single thread executor starts a replacement worker if its thread terminates because of a failure), so I waited for the issue to happen again and took a thread dump using jstack. The thread dump shows exactly where the executor service's thread is blocked.

"pool-54-thread-1" #354 prio=5 os_prio=0 tid=0x567c3c00 nid=0xae4a waiting on condition [0x51125000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x69458368> (a com.aqua.api.SequentialExecutorService$ClientTaskHandle)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:429)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at com.aqua.jms.multiserver.impl.MultiServerJmsConnection.isConsumerConfigured(MultiServerJmsConnection.java:301)
    at com.aqua.jms.multiserver.migration.MigrationConnectionWrapper.getAdministrationConnection(MigrationConnectionWrapper.java:152)

Steps I took when it happened again:

  1. Identify the thread name of the executor service's single thread.
  2. On Linux, identify the PID of the process.
  3. Use jstack to take a thread dump of that PID:
     $ jstack 33516 > threaddump.txt
  4. Search the thread dump for the thread name (see above).
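Steps 1 and 4 are easier when the worker thread has a recognizable name instead of the default pool-N-thread-M. The sketch below shows one way to do that with a ThreadFactory; the class name NamedExecutorDemo and the thread name "inbound-msg-sender" are hypothetical and not taken from the original code.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;

public class NamedExecutorDemo {

    /** Creates a single-thread executor whose worker thread carries the given name. */
    public static ExecutorService newNamedSingleThreadExecutor(final String threadName) {
        ThreadFactory factory = new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                // The name given here is what appears in the jstack output.
                return new Thread(r, threadName);
            }
        };
        return Executors.newSingleThreadExecutor(factory);
    }

    /** Runs one task on the named executor and returns the name of the thread that ran it. */
    public static String workerThreadName(String threadName) throws Exception {
        ExecutorService executor = newNamedSingleThreadExecutor(threadName);
        try {
            Callable<String> task = () -> Thread.currentThread().getName();
            return executor.submit(task).get(5, TimeUnit.SECONDS);
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(workerThreadName("inbound-msg-sender"));
    }
}
```

With a name like this, the relevant entry can be found by searching threaddump.txt for "inbound-msg-sender" rather than guessing which pool-N-thread-1 belongs to which producer.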

You can clearly see from the stack trace that the thread is alive and in the WAITING state inside FutureTask.get(). So all that needs to be done is to fix the FutureTask that never completes, or to refactor the logic out of it and make it available for my thread to call directly.
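Separately from the fix described above, a defensive option is to bound the wait so that one stuck Future cannot park the executor thread forever. This is only a sketch of that general technique, not what was done in the original code: the class BoundedWaitDemo, the helper getOrFallback, and the fallback value are all hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedWaitDemo {

    /**
     * Waits for the future up to the given timeout. If the result does not arrive
     * in time, the future is cancelled and the fallback value is returned instead.
     */
    public static <T> T getOrFallback(Future<T> future, long timeout, TimeUnit unit, T fallback)
            throws InterruptedException, ExecutionException {
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            // Give up on the stuck work so the calling thread can move on to the next task.
            future.cancel(true);
            return fallback;
        }
    }

    public static void main(String[] args) throws Exception {
        // A future that nobody completes, standing in for the stuck FutureTask.
        CompletableFuture<String> stuck = new CompletableFuture<>();
        String result = getOrFallback(stuck, 100, TimeUnit.MILLISECONDS, "fallback");
        System.out.println(result);
    }
}
```

Whether a timeout is acceptable depends on the task: if messages must not be skipped, logging the timeout and raising an alert may be more appropriate than silently continuing with a fallback.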

Michael Starkie answered Nov 15 '22 06:11