
Optimising Java objects for CPU cache line efficiency

I'm writing a library where:

  • It will need to run on a wide range of different platforms / Java implementations (the common case is likely to be OpenJDK or Oracle Java on 64-bit Intel machines running Windows or Linux)
  • Achieving high performance is a priority, to the extent that I care about CPU cache line efficiency in object access
  • In some areas, quite large graphs of small objects will be traversed / processed (let's say around 1GB scale)
  • The main workload is almost exclusively reads
  • Reads will be scattered across the object graph, but not totally randomly (i.e. there will be significant hotspots, with occasional reads to less frequently accessed areas)
  • The object graph will be accessed concurrently (but not modified) by multiple threads. There is no locking, on the assumption that concurrent modification will not occur.

Are there some rules of thumb / guidelines for designing small objects so that they utilise CPU cache lines effectively in this kind of environment?

I'm particularly interested in sizing and structuring the objects correctly, so that e.g. the most commonly accessed fields fit in the first cache line etc.

Note: I am fully aware that this is implementation dependent, that I will need to benchmark, and of the general risks of premature optimization. No need to waste any further bandwidth pointing this out. :-)

asked Dec 31 '12 by mikera




2 Answers

A first step towards cache line efficiency is to provide for locality of reference (i.e. keeping related data close together in memory). This is hard to do in Java, where almost everything is heap-allocated and accessed by reference.

To avoid references, the following might be obvious:

  1. use primitive types (e.g. int, char) rather than references as fields in your objects
  2. keep your objects in arrays
  3. keep your objects small

These rules will at least ensure some locality of reference when working on a single object and when traversing the object references in your object graph.
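For illustration, here is a minimal sketch of those three rules applied to a hypothetical graph node. The class and field names are invented for this example, and note that the JVM is free to reorder fields, so declaration order is only a hint, not a guarantee:

```java
// Hypothetical node type: small, primitive-only fields, with neighbours
// referenced by int index into shared arrays rather than by object reference.
final class Node {
    // frequently read fields, kept small and primitive
    int   value;
    short flags;
    // adjacency stored as indices into a flattened edge array,
    // avoiding one reference hop per edge
    int firstEdge;   // index of this node's first outgoing edge in Graph.edges
    int edgeCount;   // number of outgoing edges
}

final class Graph {
    final Node[] nodes;  // all nodes allocated together, ideally in one pass
    final int[]  edges;  // flattened adjacency: target node indices

    Graph(int nodeCount, int edgeCount) {
        nodes = new Node[nodeCount];
        edges = new int[edgeCount];
        for (int i = 0; i < nodeCount; i++) {
            nodes[i] = new Node();
        }
    }
}
```

Allocating all the nodes in one tight loop gives the garbage collector a good chance of placing them contiguously, which helps when the graph is traversed later.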

Another approach might be not to use objects for your data at all, but to keep one global primitive array (all of the same length) per item that would normally be a field in your class; each instance is then identified by a common index into these arrays.
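A minimal sketch of that parallel-array ("structure of arrays") layout, again with invented names:

```java
// Each would-be field of a node becomes one primitive array, and a "node"
// is just an int index into those arrays.
final class NodeStore {
    final int[]   value;      // was: Node.value
    final short[] flags;      // was: Node.flags
    final int[]   firstEdge;  // was: Node.firstEdge
    final int[]   edgeCount;  // was: Node.edgeCount

    NodeStore(int capacity) {
        value     = new int[capacity];
        flags     = new short[capacity];
        firstEdge = new int[capacity];
        edgeCount = new int[capacity];
    }

    // A scan that only needs one "field" walks a single contiguous array,
    // which is about as cache-friendly as Java allows.
    long sumValues() {
        long sum = 0;
        for (int v : value) {
            sum += v;
        }
        return sum;
    }
}
```

The trade-off is that you give up normal object-oriented ergonomics (no per-instance methods or polymorphism), but reads that only touch a few "fields" stay within contiguous memory instead of chasing references.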

Then, to optimize the size of the arrays (or chunks thereof), you need to know the cache/MMU characteristics (page size, cache line size, number of cache lines, etc.). I don't know whether Java exposes these through the System or Runtime classes, but you could pass the information in as system properties at start-up.
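One way to do that, assuming made-up property names such as cache.line.bytes, is to read them with Integer.getInteger at start-up:

```java
// Sketch of passing cache geometry in as system properties, e.g.
//   java -Dcache.line.bytes=64 -Dcache.l1d.bytes=32768 ...
// The property names and defaults are invented for this example; Java itself
// does not expose cache sizes through System or Runtime.
final class CacheInfo {
    static final int LINE_BYTES = Integer.getInteger("cache.line.bytes", 64);
    static final int L1D_BYTES  = Integer.getInteger("cache.l1d.bytes", 32 * 1024);

    // e.g. round a chunk size up to a whole number of cache lines
    static int roundUpToLine(int bytes) {
        return ((bytes + LINE_BYTES - 1) / LINE_BYTES) * LINE_BYTES;
    }
}
```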

Of course this is totally orthogonal to what you should normally be doing in Java :)

Best regards

answered Oct 20 '22 by Gregor Ophey


You may need information about the various caches of your CPU; you can access it from Java using Cachesize (which currently supports Intel CPUs). This can help in developing cache-aware algorithms.

Disclaimer: author of the lib.

answered Oct 20 '22 by Julien