Although immutability is good, it is not necessarily going to improve latency: every change to an immutable value allocates a new object, which can add GC pressure. Ensuring low latency is also likely to be platform-dependent.
Beyond general performance, GC tuning is very important. Reducing memory usage helps the GC. In particular, try to reduce the number of middle-aged objects that need to be moved about: keep each object either long-lived or short-lived. Also avoid anything touching the perm gen (replaced by Metaspace since Java 8).
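One way to keep objects long-lived rather than middle-aged is to allocate them once at startup and reuse them. The sketch below is only an illustration of that idea, not a tuned implementation: `MessageRing` and `Message` are hypothetical names, and the class assumes a single thread owns the ring.

```java
// Hypothetical example: a fixed set of mutable message objects, allocated once
// at startup and reused in a cycle, so the hot path allocates nothing.
final class MessageRing {
    // Mutable holder that is overwritten for each new message.
    static final class Message {
        long id;
        double price;
    }

    private final Message[] slots;
    private int next; // index of the slot handed out by the next claim()

    MessageRing(int capacity) {
        slots = new Message[capacity];
        for (int i = 0; i < capacity; i++) {
            slots[i] = new Message(); // long-lived: allocated once, never discarded
        }
    }

    // Returns the next slot for the caller to overwrite. Not thread-safe:
    // this sketch assumes a single thread owns the ring.
    Message claim() {
        Message m = slots[next];
        next = (next + 1) % slots.length;
        return m;
    }
}
```

The trade-off is that a slot is overwritten once the ring wraps around, so the capacity has to cover the number of messages that can be in flight at once.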
Avoid boxing/unboxing; use primitive variables where possible.
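As a rough illustration of the difference, the two methods below compute the same sum. The class and method names are made up for this sketch; the boxed version stores `Integer` objects in a list and unboxes each one, while the primitive version works on an `int[]` and creates no per-element objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example contrasting boxed and primitive storage.
final class BoxingExample {
    // Boxed: each element is an Integer object (autoboxing on add,
    // unboxing on every read).
    static long sumBoxed(int n) {
        List<Integer> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add(i); // autoboxing int -> Integer
        }
        long sum = 0;
        for (Integer v : values) {
            sum += v; // unboxing Integer -> int
        }
        return sum;
    }

    // Primitive: one contiguous int[] and no per-element objects.
    static long sumPrimitive(int n) {
        int[] values = new int[n];
        for (int i = 0; i < n; i++) {
            values[i] = i;
        }
        long sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }
}
```

How much this matters depends on the JVM and the workload, so it is worth measuring rather than assuming.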
Avoid context switching wherever possible on the message-processing path. Consequence: use NIO and a single event-loop thread (the reactor pattern).
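A minimal sketch of such a single-threaded reactor, assuming a simple echo protocol, might look like the following. `EchoReactor` is a hypothetical name; the loopback address, the 4096-byte buffer, and the busy-loop on partial writes are simplifications for illustration only.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

// Hypothetical example: one thread accepts, reads and writes for all connections.
final class EchoReactor implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;
    // One buffer reused for every read; safe because only the loop thread touches it.
    private final ByteBuffer buffer = ByteBuffer.allocateDirect(4096);

    EchoReactor(int port) throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.configureBlocking(false);
        server.bind(new InetSocketAddress("127.0.0.1", port)); // port 0 picks a free port
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() {
        return server.socket().getLocalPort();
    }

    @Override
    public void run() {
        try {
            while (selector.isOpen()) {
                selector.select(); // block until at least one channel is ready
                if (!selector.isOpen()) {
                    break;
                }
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        accept();
                    } else if (key.isReadable()) {
                        echo(key);
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or failed: leave the loop.
        }
    }

    private void accept() throws IOException {
        SocketChannel client = server.accept();
        if (client != null) {
            client.configureBlocking(false);
            client.register(selector, SelectionKey.OP_READ);
        }
    }

    private void echo(SelectionKey key) {
        SocketChannel client = (SocketChannel) key.channel();
        try {
            buffer.clear();
            if (client.read(buffer) < 0) { // peer closed the connection
                key.cancel();
                client.close();
                return;
            }
            buffer.flip();
            while (buffer.hasRemaining()) {
                client.write(buffer); // simplification: spins if the socket buffer is full
            }
        } catch (IOException e) {
            key.cancel();
            try { client.close(); } catch (IOException ignored) { }
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes the loop thread, which then exits
        server.close();
    }
}
```

Because every connection is served by the same thread, no locks are needed around the shared buffer. A production loop would register `OP_WRITE` interest instead of spinning on partial writes.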
Buy, read, and understand Effective Java. It is also available online.
Avoid extensive locking and multi-threading, which disrupt the optimizations in modern processors (and their caches). You can then push a single thread to remarkable limits (6 million transactions per second) with very low latency.
If you want to see a real-world low-latency Java application with enough detail about its architecture, have a look at LMAX:
The LMAX Architecture
Measure, measure, and measure. Run benchmarks regularly, using data as close to real as possible on hardware as close to production as possible. Low-latency applications are often better considered as appliances, so you need to consider the whole box as deployed, not just a particular method/class/package/application/JVM. If you do not build realistic benchmarks on production-like settings, you will have surprises in production.
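For latency, averages hide the outliers that matter, so it usually helps to record every sample and look at percentiles. The sketch below is a hand-rolled illustration, not a substitute for a proper harness: `LatencyProbe` is a hypothetical name, the workload in `main` is a placeholder, the percentile uses the nearest-rank method, and the warm-up and sample counts are arbitrary.

```java
import java.util.Arrays;

// Hypothetical example: time a task repeatedly and report latency percentiles.
final class LatencyProbe {
    // Runs the task `warmup` times untimed, then records `samples` latencies
    // in nanoseconds and returns them sorted in ascending order.
    static long[] measure(Runnable task, int warmup, int samples) {
        for (int i = 0; i < warmup; i++) {
            task.run(); // give the JIT a chance to compile the hot path
        }
        long[] latencies = new long[samples]; // preallocated: no allocation while timing
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            task.run();
            latencies[i] = System.nanoTime() - start;
        }
        Arrays.sort(latencies);
        return latencies;
    }

    // Nearest-rank percentile of an ascending-sorted array; p is in (0, 100].
    static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        int index = Math.min(Math.max(rank - 1, 0), sorted.length - 1);
        return sorted[index];
    }

    public static void main(String[] args) {
        // Placeholder workload; substitute the real message-processing call.
        long[] sorted = measure(() -> Math.sqrt(System.nanoTime()), 100_000, 1_000_000);
        System.out.printf("p50=%dns p99=%dns max=%dns%n",
                percentile(sorted, 50), percentile(sorted, 99), percentile(sorted, 100));
    }
}
```

A micro-measurement like this only covers one code path inside one JVM; it complements whole-box benchmarks on production-like hardware rather than replacing them.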