Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Java scheduler executor timing issues on virtual windows server

We have a Java application that needs to run, among other environments, on a Virtual (Hyper-V) Windows 2012 R2 Server. When executed on this virtual Windows server, it seems to experience weird timing issues. We've traced the issue to erratic scheduling in a Java scheduled executor:

public static class TimeRunnable implements Runnable {

    private long lastRunAt;

    @Override
    public void run() {
        long now = System.nanoTime();
        System.out.println(TimeUnit.NANOSECONDS.toMillis(now - lastRunAt));
        lastRunAt = now;
    }

}

public static void main(String[] args) {
    ScheduledExecutorService exec = Executors.newScheduledThreadPool(1);
    exec.scheduleAtFixedRate(new TimeRunnable(), 0, 10, TimeUnit.MILLISECONDS);
}

This code, which should run the TimeRunnable every 10ms, produces results such as these on the server:

12
15
2
12
15
0
14
16
2
12
140
0
0
0
0
0
0
0
0
0
0
0
0
1
0
7
15
0
14
16
2
12
15
2
12
1
123
0
0
0

While on other machines, including heavily loaded virtual Linux boxes, as well as some windows desktops, a typical run looks like this:

9
9
10
9
10
9
10
10
9
10
9
9
10
10
9
9
9
9
10
10
9
9
10
10
9
9
10
9
10
10
10
11
8
9
10
9
10
9
10
10
9
9
9
10
9
9
10
10
10
9
10

We don't have a lot of experience with Windows Server and Hyper-V, so can anyone offer an explanation for this phenomena? It it a Windows Server issue? Hyper-V? A Java bug on these platforms? Is there a solution?

EDIT: A colleague has written a C# version of the same program:

private static Stopwatch stopwatch = new Stopwatch();

public static void Main()
{
    stopwatch.Start();
    Timer timer = new Timer(callback, null, TimeSpan.FromMilliseconds(10), TimeSpan.FromMilliseconds(10));
}

private static void callback(object state)
{
    stopwatch.Stop();
    TimeSpan span = stopwatch.Elapsed;
    Console.WriteLine((int)span.TotalMilliseconds);
    stopwatch.Restart();
}

Here's an updated (partial) screenshot of both applications working side by side on the virtual windows server:

enter image description here

EDIT: A few other variants of the Java program all produce (pretty much) the same output:

  1. A variant in which System.nanoTime() was replaced with System.currentTimeMillis()
  2. A variant in which System.out.println() was replaced with a periodically printed StringBuilder
  3. A variant in which the scheduling mechanism was replaced with a single thread that times itself through Thread.sleep()
  4. A variant in which the lastRunAt is volatile
like image 805
Malt Avatar asked Feb 16 '16 11:02

Malt


1 Answers

This is caused by the System.currentTimeMillis() granularity. Note the comment there:

Note that while the unit of time of the return value is a millisecond, the granularity of the value depends on the underlying operating system and may be larger.

I recorded a granularity of about 15ms on one machine a while back. This would explain all of the 0 values you are seeing but not the big values.

Running an enhanced version of your test:

static final TreeMap<Long, AtomicInteger> counts = new TreeMap<>();

public static final AtomicInteger inc(AtomicInteger i) {
    i.incrementAndGet();
    return i;
}

public static class TimeRunnable implements Runnable {

    private long lastRunAt;

    @Override
    public void run() {
        long now = System.nanoTime();
        long took = TimeUnit.NANOSECONDS.toMillis(now - lastRunAt);
        counts.compute(took, (k, v) -> (v == null) ? new AtomicInteger(1) : inc(v));
        //System.out.println(TimeUnit.NANOSECONDS.toMillis(now - lastRunAt));
        lastRunAt = now;
    }

}

public void test() throws InterruptedException {
    System.out.println("Hello");
    ScheduledExecutorService exec = Executors.newScheduledThreadPool(1);
    exec.scheduleAtFixedRate(new TimeRunnable(), 0, 10, TimeUnit.MILLISECONDS);
    // Wait a bit.
    Thread.sleep(10000);
    // Shut down.
    exec.shutdown();
    while (!exec.awaitTermination(60, TimeUnit.SECONDS)) {
        System.out.println("Waiting");
    }
    System.out.println("counts - " + counts);
}

I get the output:

counts - {0=361, 2=1, 8=2, 13=2, 14=18, 15=585, 16=25, 17=1, 18=1, 22=1, 27=1, 62=1, 9295535=1}

The huge outlier is the first hit - when lastRunAt is zero. The 0=361 was when you were called 10ms later but System.currentTimeMillis() had not kicked over one of it's ticks. Note the peak at 15=585 showing a clear peak at 15ms as I suggested.

I have no explanation for the 62=1.

like image 124
OldCurmudgeon Avatar answered Nov 06 '22 08:11

OldCurmudgeon