How can I determine what core a Java thread is running on?

I want to implement a CoreLocal map, which works just like ThreadLocal, only it returns a value that is specific to the core the current thread is running on.

The reason is that I want to write code that takes a job from a queue but gives priority to jobs whose associated data is already in the L1 cache of the core running the worker thread. So, instead of one job queue for the entire program, I want a queue for each core, and only when its own queue is empty will a worker thread go looking at the queues of other cores.
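
To make the intended scheme concrete, here is a rough sketch of the queue side only. The class name `PerCoreQueues` and the `currentCore` argument are made up for illustration; how to obtain the core number is exactly the open question.

```java
import java.util.concurrent.ConcurrentLinkedQueue;

/** Hypothetical sketch: one job queue per core, with fallback to other cores' queues. */
final class PerCoreQueues<T> {
    private final ConcurrentLinkedQueue<T>[] queues;

    @SuppressWarnings("unchecked")
    PerCoreQueues(int cores) {
        queues = new ConcurrentLinkedQueue[cores];
        for (int i = 0; i < cores; i++) {
            queues[i] = new ConcurrentLinkedQueue<>();
        }
    }

    /** Enqueue a job on the queue of the core whose cache likely holds its data. */
    void offer(int core, T job) {
        queues[Math.floorMod(core, queues.length)].add(job);
    }

    /** Try the caller's own core first; scan the other queues only if it is empty. */
    T poll(int currentCore) {
        int start = Math.floorMod(currentCore, queues.length);
        for (int i = 0; i < queues.length; i++) {
            T job = queues[(start + i) % queues.length].poll();
            if (job != null) {
                return job;
            }
        }
        return null; // every queue was empty
    }
}
```

A wrong `currentCore` only costs locality here: the job is still found, just possibly on another core's queue.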

gregw asked Mar 11 '13 22:03


2 Answers

I don't think the JDK currently exposes any call to get the CPU the current thread is running on, although it has certainly been discussed [1] and proposed as a JDK enhancement.

Until something like that gets implemented, I think your best bet is to use JNA (easiest) or JNI (faster) to wrap a native system call such as getcpu on Linux or GetCurrentProcessorNumber on Windows.

At least on Linux, getcpu is implemented in the VDSO without a kernel transition, so it should only take a few nanoseconds, plus a few more nanoseconds for the JNI call. JNA is slower.

If you really need speed, you could always add the function as an intrinsic to a bespoke JVM (since OpenJDK is open source). That would shave off several more nanoseconds.

Keep in mind that this information can be out of date as soon as you get it, so you should never rely on it for correctness, only performance. Since you already need to handle getting the "wrong" value, another possible approach is to store the cached value of the CPU ID in a ThreadLocal, and only update it periodically. This makes slow approaches such as parsing the /proc filesystem viable since you do them only infrequently. For maximum speed, you can invalidate the thread-local periodically from a timer thread, rather than checking the invalidation condition on each call.
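
A minimal sketch of that caching idea follows. The class name `CachedCpuId` and the `slowLookup` supplier are hypothetical; the supplier stands in for whatever lookup you end up with (JNI, JNA, or /proc parsing). Note that this sketch still compares an epoch counter on every call, which is a cheap volatile read, instead of checking a clock.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.IntSupplier;

/** Hypothetical sketch: per-thread cache of a slow "which CPU am I on" lookup. */
final class CachedCpuId {
    private final IntSupplier slowLookup;           // assumed: JNI, JNA, or /proc parsing
    private final AtomicLong epoch = new AtomicLong();
    // slot[0] = epoch at which the value was read, slot[1] = cached CPU id
    private final ThreadLocal<long[]> cache =
            ThreadLocal.withInitial(() -> new long[] {-1, -1});

    CachedCpuId(IntSupplier slowLookup) {
        this.slowLookup = slowLookup;
    }

    /** Meant to be called from a timer thread; forces a re-read on each thread's next get(). */
    void invalidateAll() {
        epoch.incrementAndGet();
    }

    /** Returns the possibly stale CPU id; use it for performance decisions only. */
    int get() {
        long[] slot = cache.get();
        long now = epoch.get();
        if (slot[0] != now) {
            slot[1] = slowLookup.getAsInt();
            slot[0] = now;
        }
        return (int) slot[1];
    }
}
```

The timer side could be as simple as `scheduler.scheduleAtFixedRate(cache::invalidateAll, 10, 10, TimeUnit.MILLISECONDS)` on a `ScheduledExecutorService`; the 10 ms period is an arbitrary choice, not a recommendation.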


[1] Both the discussion and the enhancement request are highly recommended reading.

BeeOnRope answered Nov 11 '22 03:11


There's a related Linux question with no satisfactory answer (parsing top output doesn't count, and the accepted answer no longer works). I thought that

/proc/<pid>/task/<tid>/sched

might give this information in a line like

 current_node=0, numa_group_id=0

but on my i5-2400 running the 4.4.0-92-generic kernel, this line is the same for all threads. I guess "node" means a whole CPU (socket), and I have only one.

I could find no documentation on this, or missed it in this document.
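
The proc(5) man page does document a neighbouring per-thread file, stat, whose field 39 ("processor") is the CPU number the thread last executed on. A minimal sketch of reading it, assuming Linux and a kernel recent enough to provide /proc/thread-self; the class name `ProcStatCpu` is made up for illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

/** Hypothetical sketch: read the "processor" field from a thread's /proc stat file. */
final class ProcStatCpu {
    /** Extracts field 39 ("processor") from the contents of a /proc/.../stat file. */
    static int parseProcessor(String statLine) {
        // Field 2 (comm) is wrapped in parentheses and may itself contain spaces
        // or ')', so skip past the last ')' before splitting.
        int close = statLine.lastIndexOf(')');
        String[] rest = statLine.substring(close + 2).trim().split(" ");
        // rest[0] is field 3 (state), so field 39 is rest[36].
        return Integer.parseInt(rest[36]);
    }

    /** Returns the CPU the calling thread last ran on, or -1 if it cannot be read. */
    static int currentCpu() {
        try {
            byte[] raw = Files.readAllBytes(Paths.get("/proc/thread-self/stat"));
            return parseProcessor(new String(raw, StandardCharsets.UTF_8));
        } catch (IOException | RuntimeException e) {
            return -1; // not Linux, kernel too old, or unexpected format
        }
    }
}
```

This opens and parses a file on every call, so it is only a candidate for infrequent use.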


However, I'm afraid that obtaining this information is unlikely to help you:

  • Reading from the proc filesystem may be too costly on the scale you're working on.
  • Unlike ThreadLocal, your CoreLocal is not thread-safe: migrating a thread to another core could break even trivial non-atomic operations like someCoreLocalField++, and so could suspending it mid-operation. So you'd need atomics or thread-locals to make it work, which again may make it far too slow for what you want.
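
To illustrate the second point, here is a sketch of a per-core counter that stays correct under migration because every update is atomic. The class name `CoreLocalCounter` and the `cpuHint` parameter are hypothetical, and the hint is assumed to come from whatever lookup you use.

```java
import java.util.concurrent.atomic.AtomicLongArray;

/** Hypothetical sketch: a per-core counter that tolerates a stale core number. */
final class CoreLocalCounter {
    private final AtomicLongArray slots;

    CoreLocalCounter(int cores) {
        slots = new AtomicLongArray(cores);
    }

    /** cpuHint may be stale or wrong; the atomic increment keeps the count correct anyway. */
    void increment(int cpuHint) {
        slots.getAndIncrement(Math.floorMod(cpuHint, slots.length()));
    }

    /** Sums all slots; the result is only a snapshot while writers are still active. */
    long sum() {
        long total = 0;
        for (int i = 0; i < slots.length(); i++) {
            total += slots.get(i);
        }
        return total;
    }
}
```

The atomic operation is the cost mentioned above, and adjacent slots in this sketch may also share a cache line. The JDK's LongAdder uses a similar striping idea, keyed on a per-thread hash instead of the core.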
maaartinus answered Nov 11 '22 04:11