On this Oracle page Java HotSpot VM Options, it lists -XX:+UseCompressedStrings
as being available and on by default. However in Java 6 update 29, it is off by default and in Java 7 update 2 it reports a warning
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0
Does anyone know the thinking behind removing this option?
sorting lines of an enormous file.txt in java
With -mx2g
, this example took 4.541 seconds with the option on and 5.206 second with it off in Java 6 update 29. It is hard to see that it impacts performance.
Note: Java 7 update 2 requires 2.0 G whereas Java 6 update 29 without compressed strings requires 1.8 GB and with compressed string requires only 1.0 GB.
Which Java Virtual Machine to choose, HotSpot or OpenJ9? Both are tunable open-source JVM implementations. HotSpot is a well-established JVM implementation initially developed by Sun Microsystems. OpenJ9, developed by IBM, is not as widespread in the industry but has gained popularity in recent years.
HotSpot, released as Java HotSpot Performance Engine, is a Java virtual machine for desktop and server computers, developed by Sun Microsystems and now maintained and distributed by Oracle Corporation. It features improved performance via methods such as just-in-time compilation and adaptive optimization.
As of today, there are 4 GC algorithms available in the Java Hotspot VM: The Serial GC - recommended for client-style applications that do not have low pause time requirements. The Parallel GC - use when the throughput matters.
This code heap contains non-method code such as compiler buffers and bytecode interpreter. This code type stays in the code cache forever. The code heap has a fixed size of 3 MB and remaining code cache is distributed evenly among the profiled and non-profiled code heaps.
Originally, this option was added to improve SPECjBB performance. The gains are due to reduced memory bandwidth requirements between the processor and DRAM. Loading and storing bytes in the byte[] consumes 1/2 the bandwidth versus chars in the char[].
However, this comes at a price. The code has to determine if the internal array is a byte[] or char[]. This takes CPU time and if the workload is not memory bandwidth constrained, it can cause a performance regression. There is also a code maintenance price due to the added complexity.
Because there weren't enough production-like workloads that showed significant gains (except perhaps SPECjBB), the option was removed.
There is another angle to this. The option reduces heap usage. For applicable Strings, it reduces the memory usage of those Strings by 1/2. This angle wasn't considered at the time of option removal. For workloads that are memory capacity constrained (i.e. have to run with limited heap space and GC takes a lot of time), this option can prove useful.
If enough memory capacity constrained production-like workloads can be found to justify the option's inclusion, then maybe the option will be brought back.
Edit 3/20/2013: An average server heap dump uses 25% of the space on Strings. Most Strings are compressible. If the option is reintroduced, it could save half of this space (e.g. ~12%)!
Edit 3/10/2016: A feature similar to compressed strings is coming back in JDK 9 JEP 254.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With