
String literals using 2x the expected amount of permanent generation space

This is Sun JDK 1.6u21, x64.

I have a class for the purpose of experimenting with perm gen usage which contains only a single large string (512k characters):

public class Big0 {
     public String bigString =
         "A string with 2^19 characters, should be 1 MB in size";
}

I check the perm gen usage by calling getUsage().toString() on the MemoryPoolMXBean object for the permanent generation (called "PS Perm Gen" in u21, although the name differs slightly between versions and between garbage collectors).
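For reference, the measurement described above could look roughly like the following. This is a minimal sketch, not the exact code from the question: the class name PermGenProbe and the helper findPool are made up for illustration, and the "Perm Gen" name fragment is an assumption based on the pool name quoted above. On JVMs without a permanent generation (Java 8 and later use Metaspace instead) no such pool exists, so the sketch lists the available pool names in that case.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class PermGenProbe {

    // Hypothetical helper: returns the first memory pool whose name
    // contains the given fragment, or null if there is no such pool.
    static MemoryPoolMXBean findPool(String nameFragment) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains(nameFragment)) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Assumption: the pool name contains "Perm Gen", as in "PS Perm Gen".
        MemoryPoolMXBean perm = findPool("Perm Gen");
        if (perm == null) {
            System.out.println("No pool matching \"Perm Gen\"; available pools:");
            for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
                System.out.println("  " + pool.getName());
            }
            return;
        }
        // getUsage() may return null if the pool is no longer valid.
        MemoryUsage usage = perm.getUsage();
        System.out.println(perm.getName() + ": " + usage);
    }
}
```

Printing the usage before and after touching the class under test gives the deltas discussed below.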

When I first reference the class, say by reading Big0.class, perm gen jumps by ~500 KB - that's what I'd expect as the constant pool encoding of the string is UTF-8, and I'm using only ASCII characters.

When I actually create an instance of this class, however, perm gen jumps by ~2 MB. Since this is a 1 MB string in memory (2 bytes per UTF-16 character, certainly no surrogates), I'm confused about why the memory usage is double that.
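The two expected sizes (about 512 KB as UTF-8 in the class file, 1 MB as UTF-16 in memory) can be sanity-checked with a few lines. This is only an illustrative sketch: the class name StringSizes and its helpers are hypothetical, and it measures plain encoded lengths, not actual JVM memory usage. The class file format uses a modified UTF-8, which gives the same length for a string of ordinary ASCII letters.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class StringSizes {

    // Builds a string of the given length consisting only of 'A'.
    static String asciiString(int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, 'A');
        return new String(chars);
    }

    // Number of bytes the string occupies in the named encoding.
    static int encodedLength(String s, String charsetName) {
        return s.getBytes(Charset.forName(charsetName)).length;
    }

    public static void main(String[] args) {
        String s = asciiString(1 << 19); // 2^19 = 524288 characters

        // One byte per ASCII character in UTF-8: 524288 bytes (512 KB).
        System.out.println("UTF-8 bytes:  " + encodedLength(s, "UTF-8"));

        // Two bytes per character in UTF-16 (BE variant, no byte order mark):
        // 1048576 bytes (1 MB).
        System.out.println("UTF-16 bytes: " + encodedLength(s, "UTF-16BE"));
    }
}
```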

The same effect occurs if I make the string static. If I use final, it fails to compile because I exceed the 65535-byte limit for constant pool items (I'm not sure why leaving final off avoids that either; consider that a bonus question).

Any insight appreciated!

Edit: I should also point out that this occurs with non-static, final non-static, and static strings, but not for final static strings. Since that's already a best practice for string constants, maybe this is of mostly academic interest.

BeeOnRope asked Nov 06 '22 02:11


1 Answer

I think it's an artefact of your test class. I created a similar class, then decompiled it with javap.

The [eclipse] Java compiler breaks the string literal into chunks, each no longer than 64k. The bytecode that initializes the non-constant field assembles the source string from those chunks with a sequence of StringBuilder operations. Although it is the final gigantic string that is interned, the large chunks it is built from also take up space in the constant pool.
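In source form, the generated initializer described above would behave roughly like this. It is a hand-written sketch, not the actual javap output: the class name ChunkedLiteral is hypothetical, the three short literals stand in for the 64k chunks, and the explicit intern() call follows the description above and is not taken from verified bytecode.

```java
public class ChunkedLiteral {

    // Each literal below stands in for one chunk of at most 64k. Every chunk
    // is its own constant pool entry and becomes its own String once loaded.
    public String bigString = new StringBuilder()
            .append("first chunk, ")
            .append("second chunk, ")
            .append("third chunk")
            .toString()   // the assembled full-length string
            .intern();    // assumption: interned, as described above

    public static void main(String[] args) {
        // Both the chunks and the assembled string are alive at this point,
        // so the character data exists twice.
        System.out.println(new ChunkedLiteral().bigString);
    }
}
```

If that reading is right, the chunks add up to the full string length and the assembled string adds the same amount again, which would account for roughly 2 MB instead of 1 MB.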

Ron answered Nov 09 '22 18:11