To resolve this issue I have created an open source Java Thread Affinity library.
When I have a number of threads interacting closely, it can reduce latency and increase throughput. For single-threaded tasks it can still reduce jitter quite a bit.
This program looks at the difference in time between calls to System.nanoTime() and reports those over 10,000 ns (10 us).
    public class TimeJumpingMain {
        static final long IGNORE_TIME = 1000 * 1000 * 1000; // the first second to allow warmup.
        static final int minJump = 10; // smallest jump of 10 us.
        static final int midJump = 100; // mid size jump of 100 us.
        static final int bigJump = 1000; // big jump of 1 ms.

        public static void main(String... args) {
            int[] intervalTimings = new int[1000];
            int[] jumpTimings = new int[1000];

            long start = System.nanoTime();
            long prev = start;
            long prevJump = start;
            int jumpCount = 0;
            int midJumpCount = 0;
            int bigJumpCount = 0;

            while (true) {
                long now = System.nanoTime();
                long jump = (now - prev) / 1000;
                if (jump > minJump && now - start > IGNORE_TIME) {
                    long interval = (now - prevJump) / 1000;
                    if (jumpCount < intervalTimings.length) {
                        intervalTimings[jumpCount] = (int) interval;
                        jumpTimings[jumpCount] = (int) jump;
                    }
                    if (jump >= midJump)
                        midJumpCount++;
                    if (jump >= bigJump)
                        bigJumpCount++;
                    prevJump = now;
                    jumpCount++;
                }
                prev = now;
                if (now - start > 120L * 1000 * 1000 * 1000 + IGNORE_TIME)
                    break;
            }

            System.out.println("interval us\tdelay us");
            for (int i = 0; i < jumpCount && i < intervalTimings.length; i++) {
                System.out.println(intervalTimings[i] + "\t" + jumpTimings[i]);
            }
            System.out.printf("Time jumped %,d / %,d / %,d times by at least %,d / %,d / %,d us in %.1f seconds %n",
                    jumpCount, midJumpCount, bigJumpCount, minJump, midJump, bigJump,
                    (System.nanoTime() - start - IGNORE_TIME) / 1e9);
        }
    }
On my machine this reports:
Time jumped 2,905 / 131 / 20 times by at least 10 / 100 / 1,000 us in 120.0 seconds
I have tried chrt to set real-time priority and taskset to try to lock to a single core AFTER STARTING the process, but these didn't help as I expected.
I configured the box to move all interrupts to cpus 0-3 and changed the cpu mask for all processes from 0xFF to 0x0F. In top, the first four cpus are ~99% idle and the last four cpus are 100.0% idle.
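As a small illustration of what those masks mean, the sketch below decodes an affinity bitmask into CPU indices, assuming the usual convention that bit 0 corresponds to cpu0. The class and method names are hypothetical and only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library: decodes a CPU affinity bitmask.
class CpuMask {
    /** Returns the CPU indices whose bits are set in the mask (bit 0 = cpu0). */
    static List<Integer> cpusInMask(long mask) {
        List<Integer> cpus = new ArrayList<>();
        for (int cpu = 0; cpu < Long.SIZE; cpu++) {
            if ((mask & (1L << cpu)) != 0) {
                cpus.add(cpu);
            }
        }
        return cpus;
    }

    public static void main(String... args) {
        System.out.println("0xFF -> " + cpusInMask(0xFFL)); // 0xFF -> [0, 1, 2, 3, 4, 5, 6, 7]
        System.out.println("0x0F -> " + cpusInMask(0x0FL)); // 0x0F -> [0, 1, 2, 3]
        // A single CPU, as in "taskset -c 7", corresponds to one set bit.
        System.out.println("cpu7 -> mask 0x" + Long.toHexString(1L << 7)); // cpu7 -> mask 0x80
    }
}
```

So with the 0x0F mask, ordinary processes are confined to cpus 0-3, which leaves cpus 4-7 free for the test.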
Using chrt -r 99 as root:
Time jumped 673 / 378 / 44 times by at least 10 / 100 / 1,000 us in 120.0 seconds
However, when using taskset -c 7 alone (I have made sure cpu7 is free):
Time jumped 24 / 1 / 0 times by at least 10 / 100 / 1,000 us in 120.0 seconds
Using chrt -r 99 taskset -c 7:
Time jumped 7 / 1 / 0 times by at least 10 / 100 / 1,000 us in 120.0 seconds
It appears that trying to use taskset after the process had started wasn't working for me.
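One possible explanation, which the post does not confirm: taskset -p changes only the single task whose id it is given unless -a is passed, and the standard java launcher runs main() on a thread other than the process's initial thread. Either way, the affinity the kernel actually applies can be checked from inside the JVM. The sketch below is a minimal, Linux-only check that reads the Cpus_allowed_list field from /proc; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper (not from the post): reports which CPUs the calling
// thread is allowed to run on, as seen by the Linux kernel.
class AffinityCheck {
    private static final String KEY = "Cpus_allowed_list:";

    /** Extracts the value of the Cpus_allowed_list field, or null if absent. */
    static String allowedCpus(List<String> statusLines) {
        for (String line : statusLines) {
            if (line.startsWith(KEY)) {
                return line.substring(KEY.length()).trim();
            }
        }
        return null;
    }

    public static void main(String... args) throws IOException {
        // /proc/thread-self describes the calling thread (Linux 3.17+);
        // /proc/self describes the process's initial thread, used as a fallback.
        Path perThread = Paths.get("/proc/thread-self/status");
        Path status = Files.exists(perThread) ? perThread : Paths.get("/proc/self/status");
        if (!Files.exists(status)) {
            System.out.println("No /proc status file found; this check is Linux-only.");
            return;
        }
        System.out.println("Allowed CPUs for this thread: "
                + allowedCpus(Files.readAllLines(status)));
    }
}
```

Calling this from the measuring thread at start-up shows whether a pin applied with taskset reached that thread.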
The broader question is:
How to reduce jitter for a Java process? Are there any more tips for reducing jitter on Linux?
NOTE: No GC occurs during the running of this process (checked with -verbosegc)
It appears that code compiling may cause a delay of 3.62 ms every time after 100 - 102 ms. For this reason I ignore everything in the first second as warmup.
There's system jitter and JVM jitter.
For the former, you can use the isolcpus parameter at boot time to ensure that nothing but your application code can run on those cpus:
http://www.novell.com/support/viewContent.do?externalId=7009596&sliceId=1
Ideally you'd do a JNI call (to your own JNI lib) down to sched_setaffinity just for the active thread so that you really do have nothing but that thread running there.
In my experience, system jitter is minimised by using isolcpus with interrupts handled by specific cores only, switching hyper-threading off, and removing absolutely all use of power management (these are BIOS options, when they're available, to turn off all the C-state and P-state management) while running your app on shielded cores. The BIOS options are specific to your motherboard, so you'll need to look them up for your motherboard model.
Another thing to look at at the system level is the local APIC interrupt (LOC, local interrupt counter) frequency. Is this a "low latency desktop" using 1 kHz interrupts? Either way, you can expect jitter to be clustered around the interrupt interval.
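To get a rough idea of the LOC rate per cpu, one option is to sample the LOC row of /proc/interrupts twice and take the difference. The sketch below does that over one second; it is Linux-only, the class and method names are hypothetical, and it assumes the usual layout of one count per cpu followed by a text description. At 1 kHz the interval between ticks is 1,000 us. A tickless kernel may show far fewer interrupts on an idle cpu, so run it while the cpu of interest is busy.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the post): estimates local timer interrupts per second.
class LocRate {
    /** Returns the per-cpu counts from the "LOC:" row, or an empty array if there is none. */
    static long[] parseLocCounts(List<String> lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (!trimmed.startsWith("LOC:")) {
                continue;
            }
            List<Long> counts = new ArrayList<>();
            for (String token : trimmed.substring("LOC:".length()).trim().split("\\s+")) {
                if (!token.matches("\\d+")) {
                    break; // reached the trailing description text
                }
                counts.add(Long.parseLong(token));
            }
            long[] result = new long[counts.size()];
            for (int i = 0; i < result.length; i++) {
                result[i] = counts.get(i);
            }
            return result;
        }
        return new long[0];
    }

    public static void main(String... args) throws IOException, InterruptedException {
        Path interrupts = Paths.get("/proc/interrupts");
        if (!Files.isReadable(interrupts)) {
            System.out.println("/proc/interrupts is not readable; this check is Linux-only.");
            return;
        }
        long[] before = parseLocCounts(Files.readAllLines(interrupts));
        Thread.sleep(1000); // sample window of roughly one second
        long[] after = parseLocCounts(Files.readAllLines(interrupts));
        for (int cpu = 0; cpu < Math.min(before.length, after.length); cpu++) {
            System.out.printf("cpu%d: ~%d local timer interrupts/s%n", cpu, after[cpu] - before[cpu]);
        }
    }
}
```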
Two more sources of jitter that I am aware of, but know practically nothing about, are kernel TLB flush interrupts and userspace TLB flush interrupts. Some RT kernels offer options to control these, so this might be another thing to look into. You can also look at this site about building RT apps on the RT kernel for more tips.