Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Selector.select(timeout) returns 0 before timeout

Tags:

java

nio

According to Javadoc,

It returns only after at least one channel is selected, this selector's wakeup method is invoked, the current thread is interrupted, or the given timeout period expires, whichever comes first.

But occasionally it returns without any of these 4 cases:

  1. at least one channel is selected: it returns 0
  2. wakeup method is invoked: wakeup is not called
  3. the current thread is interrupted: Thread.interrupted() returns false
  4. given timeout period expires: not expired according to logs

UPDATED 2016-03-15

In my source at line 392 and line 402 I added some logs: https://github.com/xqbase/tuna/blob/debug/core/src/main/java/com/xqbase/tuna/ConnectorImpl.java

public boolean doEvents(long timeout) {
    Log.v("Before Select: " + timeout);
    int keySize;
    try {
        keySize = timeout == 0 ? selector.selectNow() :
                timeout < 0 ? selector.select() : selector.select(timeout);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    Set<SelectionKey> selectedKeys = selector.selectedKeys();
    if (keySize == 0) {
        Log.v("After Select(0): selectedKeys=" + selectedKeys.size() + ", " +
                "interrupt=" + Thread.interrupted());
        invokeQueue();
        return false;
    }

    for (SelectionKey key : selectedKeys) {
        ...

Here is the log:

...
2016-03-15 23:07:49.695 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8120
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8119
2016-03-15 23:07:49.696 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.700 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8115
2016-03-15 23:07:49.701 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
2016-03-15 23:07:49.701 com.xqbase.tuna.ConnectorImpl doEvents
FINE: Before Select: 8114
2016-03-15 23:07:49.702 com.xqbase.tuna.ConnectorImpl doEvents
FINE: After Select(0): selectedKeys=0, interrupt=false
...

That is very strange: no selected keys, no interruption, no timeout and no wakeup, but it returned.

Is there a bug in Java? My Java version is 1.8.0_51-b16 (64-Bit Server VM), and run on a CentOS 6.5 x64 linode.

like image 367
auntyellow Avatar asked Sep 25 '22 03:09

auntyellow


1 Answers

This may really be a bug in JDK. It seems that Netty and Mina also encounter such a problem and they rebuild the selector as a workaround.

See latest Netty code https://github.com/netty/netty/blob/4.1/transport/src/main/java/io/netty/channel/nio/NioEventLoop.java L641-681:

            if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
                // - Selected something,
                // - waken up by user, or
                // - the task queue has a pending task.
                // - a scheduled task is ready for processing
                break;
            }
            ...
            } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                    selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) {
                // The selector returned prematurely many times in a row.
                // Rebuild the selector to work around the problem.
                logger.warn(
                        "Selector.select() returned prematurely {} times in a row; rebuilding selector.",
                        selectCnt);

                rebuildSelector();
                selector = this.selector;

                // Select again to populate selectedKeys.
                selector.selectNow();
                selectCnt = 1;
                break;
            }

See Mina 2.0 code https://github.com/apache/mina/blob/2.0/mina-core/src/main/java/org/apache/mina/core/polling/AbstractPollingIoProcessor.java L1070-1092:

                if (!wakeupCalled.getAndSet(false) && (selected == 0) && (delta < 100)) {
                    // Last chance : the select() may have been
                    // interrupted because we have had an closed channel.
                    if (isBrokenConnection()) {
                        LOG.warn("Broken connection");
                    } else {
                        LOG.warn("Create a new selector. Selected is 0, delta = " + (t1 - t0));
                        // Ok, we are hit by the nasty epoll
                        // spinning.
                        // Basically, there is a race condition
                        // which causes a closing file descriptor not to be
                        // considered as available as a selected channel,
                        // but
                        // it stopped the select. The next time we will
                        // call select(), it will exit immediately for the
                        // same
                        // reason, and do so forever, consuming 100%
                        // CPU.
                        // We have to destroy the selector, and
                        // register all the socket on a new one.
                        registerNewSelector();
                    }
                }

Therefore, registering a new selector may be the best practice if select() returns an unexpected zero.

like image 139
auntyellow Avatar answered Nov 09 '22 17:11

auntyellow