Support for Compressed Strings being Dropped in HotSpot JVM?

Oracle's Java HotSpot VM Options page lists -XX:+UseCompressedStrings as available and on by default. However, in Java 6 update 29 it is off by default, and in Java 7 update 2 it produces a warning:

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option UseCompressedStrings; support was removed in 7.0 

Does anyone know the thinking behind removing this option?


As a test I used the example from "sorting lines of an enormous file.txt in java". With -mx2g, this example took 4.541 seconds with the option on and 5.206 seconds with it off in Java 6 update 29, so it is hard to see that the option hurts performance.

Note: Java 7 update 2 requires 2.0 GB, whereas Java 6 update 29 requires 1.8 GB without compressed strings and only 1.0 GB with them.
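
To check whether a particular JVM still recognizes the flag, one option is to ask the HotSpot diagnostic MXBean. This is a minimal sketch, assuming a HotSpot-based JVM on Java 7 or later; the class name FlagCheck and the helper describeFlag are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class FlagCheck {
    /** Returns "name = value", or a note when this JVM does not know the option. */
    static String describeFlag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            return name + " = " + bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // getVMOption throws this when the option does not exist in the running VM
            return name + " is not a known option on this JVM";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFlag("UseCompressedStrings"));
    }
}
```

The result depends on the JVM version it runs on, so no expected output is shown.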

Peter Lawrey asked Jan 12 '12 10:01




1 Answer

Originally, this option was added to improve SPECjBB performance. The gains come from reduced memory bandwidth requirements between the processor and DRAM: loading and storing bytes in a byte[] consumes half the bandwidth of loading and storing chars in a char[].

However, this comes at a price. The code has to determine whether the internal array is a byte[] or a char[]. This takes CPU time, and if the workload is not memory-bandwidth constrained, it can cause a performance regression. There is also a code maintenance price due to the added complexity.
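
To make the trade-off concrete, here is a rough sketch of a string-like class with two possible backing arrays. It is not HotSpot's actual implementation; the class DualText, its methods, and the 7-bit ASCII cutoff are all assumptions chosen for illustration. Note the branch in charAt, which is the per-access check described above.

```java
public final class DualText {
    private final byte[] bytes; // set when every char fits in 7-bit ASCII (assumed cutoff)
    private final char[] chars; // fallback for everything else

    public DualText(String s) {
        boolean ascii = true;
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                ascii = false;
                break;
            }
        }
        if (ascii) {
            byte[] b = new byte[s.length()];
            for (int i = 0; i < b.length; i++) {
                b[i] = (byte) s.charAt(i);
            }
            bytes = b;
            chars = null;
        } else {
            bytes = null;
            chars = s.toCharArray();
        }
    }

    public boolean isCompressed() {
        return bytes != null;
    }

    public char charAt(int i) {
        // Every access pays for this representation check.
        return bytes != null ? (char) bytes[i] : chars[i];
    }

    /** Size of the character payload only, ignoring array and object headers. */
    public int payloadBytes() {
        return bytes != null ? bytes.length : 2 * chars.length;
    }

    public static void main(String[] args) {
        DualText plain = new DualText("hello");
        DualText accented = new DualText("h\u00e9llo");
        System.out.println(plain.isCompressed() + " " + plain.payloadBytes());       // true 5
        System.out.println(accented.isCompressed() + " " + accented.payloadBytes()); // false 10
    }
}
```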

Because there weren't enough production-like workloads that showed significant gains (except perhaps SPECjBB), the option was removed.

There is another angle to this. The option reduces heap usage: for applicable Strings, it cuts their memory usage in half. This angle wasn't considered when the option was removed. For workloads that are memory-capacity constrained (i.e. they have to run with limited heap space and GC takes a lot of time), this option can prove useful.

If enough memory-capacity-constrained, production-like workloads can be found to justify the option's inclusion, then maybe the option will be brought back.

Edit 3/20/2013: In an average server heap dump, Strings take up 25% of the space. Most Strings are compressible, so if the option were reintroduced, it could save half of that space (i.e. ~12% of the heap)!
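
The estimate above is simple arithmetic, sketched here as a back-of-the-envelope calculation. The helper heapSavedFraction is hypothetical, and it treats the whole String footprint as character data, ignoring object and array headers, so it gives an upper bound rather than a measurement.

```java
public class StringSavings {
    /**
     * Fraction of the heap saved if compressible Strings shrink to half their size.
     * stringShare: fraction of the heap occupied by Strings (0.25 in the edit above).
     * compressibleShare: fraction of those Strings that can be compressed.
     */
    static double heapSavedFraction(double stringShare, double compressibleShare) {
        return stringShare * compressibleShare * 0.5;
    }

    public static void main(String[] args) {
        // All Strings compressible: 0.25 * 1.0 * 0.5
        System.out.println(heapSavedFraction(0.25, 1.0)); // prints 0.125
    }
}
```

With every String compressible, the saving is 12.5% of the heap, which matches the "~12%" figure. A lower compressible share scales the saving down proportionally.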

Edit 3/10/2016: A feature similar to compressed strings is coming back in JDK 9 with JEP 254 (Compact Strings).

Nathan answered Oct 14 '22 16:10