Garbage collection behaviour for String.intern()

If I use String.intern() to improve performance, so that I can use "==" to compare interned strings, will I run into garbage collection issues? How does garbage collection of interned strings differ from that of normal strings?

Ravi Gupta asked Mar 12 '10 09:03



3 Answers

String.intern() manages an internal pool, implemented in native code, which has some special GC-related features. This is old code, but if it were implemented anew, it would probably use a java.util.WeakHashMap. Weak references are a way to keep a pointer to an object without preventing it from being collected, which is just the right thing for a unifying pool such as the one for interned strings.
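To make the weak-reference idea concrete, here is a minimal sketch of what such a pool could look like. The class name WeakInternPool and its intern() method are hypothetical, made up for illustration; the real pool lives inside the JVM and is not implemented this way.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical illustration only: the real String.intern() pool is native JVM code.
final class WeakInternPool {
    // Keys are held weakly by WeakHashMap, and values are wrapped in WeakReference,
    // so the pool alone never keeps a string alive.
    private static final Map<String, WeakReference<String>> POOL = new WeakHashMap<>();

    static synchronized String intern(String s) {
        WeakReference<String> ref = POOL.get(s);
        String canonical = (ref == null) ? null : ref.get();
        if (canonical == null) {
            // First time this content is seen (or the earlier copy was collected).
            POOL.put(s, new WeakReference<>(s));
            canonical = s;
        }
        return canonical;
    }

    public static void main(String[] args) {
        String a = new String("hello");
        String b = new String("hello");
        System.out.println(a == b);                  // false: two distinct objects
        System.out.println(intern(a) == intern(b));  // true: both map to the same instance
    }
}
```

The single synchronized method also hints at why a shared pool can become a contention point when many threads intern strings at once.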

That interned strings are garbage collected can be demonstrated with the following Java code:

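public class InternedStringsAreCollected {

    public static void main(String[] args)
    {
        for (int i = 0; i < 30; i ++) {
            foo();
            // Request a collection so the now-unreferenced interned string can be reclaimed.
            System.gc();
        }
    }

    private static void foo()
    {
        // Build the same 10-character string at run time on every call.
        char[] tc = new char[10];
        for (int i = 0; i < tc.length; i ++)
            tc[i] = (char)(i * 136757);
        String s = new String(tc).intern();
        // The identity hash code tells distinct instances apart.
        System.out.println(System.identityHashCode(s));
    }
}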

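This code creates the same string 30 times, interning it each time, and uses System.identityHashCode() to show the hash code that Object.hashCode() would have returned for that interned string. If interned strings were never collected, every call would get back the same pooled instance and print the same value. When run, this code prints out distinct integer values, which means that you do not get the same instance each time: the earlier instance was collected and a fresh one was interned.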

Anyway, usage of String.intern() is somewhat discouraged. It is a shared static pool, which means that it easily turns into a bottleneck on multi-core systems. Use String.equals() to compare strings, and you will live longer and happier.

Thomas Pornin answered Oct 03 '22 11:10


In fact, this is not a garbage collection optimisation, but rather a string pool optimisation. When you call String.intern(), you replace the reference to your initial String with its canonical reference: the reference to the first instance of this string that was interned, or to your own instance if the string was not yet in the pool.

However, it can become a memory issue once your strings are no longer used by the application: the pool is a single table shared by the whole JVM, and string literals in it stay reachable for as long as the class that defines them is loaded.

As a rule of thumb, I consider it preferable never to call the intern() method yourself and to let the compiler use it only for constant Strings, those declared like this:

String myString = "a constant that will be interned";

This is better in the sense that it won't lead you to the false assumption that == will work when it won't.
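A small sketch of where == does and does not work may help here. The class name is made up for the example; the behaviour shown follows from string literals being interned automatically while strings built at run time are not.

```java
final class LiteralVersusRuntimeString {
    public static void main(String[] args) {
        String literal = "a constant that will be interned";
        String sameLiteral = "a constant that will be interned";
        // Built at run time, so it is a separate object with the same contents.
        String runtime = new String(literal);

        System.out.println(literal == sameLiteral);       // true: one pooled instance
        System.out.println(literal == runtime);           // false: different objects
        System.out.println(literal.equals(runtime));      // true: same contents
        System.out.println(literal == runtime.intern());  // true: intern() returns the pooled instance
    }
}
```

The second comparison is the trap: the two strings are equal, but == only reports whether they are the same object.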

Besides, String.equals() itself checks == first as an optimisation, so comparisons between interned strings still take the fast path under the hood. This is one more reason never to use == on Strings.
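The identity shortcut can be sketched roughly as follows. This is a simplified stand-in written for illustration, not the JDK source; the class and the method sameContents() are hypothetical names.

```java
final class EqualsFastPath {
    // Simplified stand-in for String.equals(): identity check first, then contents.
    static boolean sameContents(String a, String b) {
        if (a == b) {
            return true; // same object (for example, both interned): no character scan needed
        }
        if (a == null || b == null || a.length() != b.length()) {
            return false;
        }
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        String pooled = "abc";
        System.out.println(sameContents(pooled, pooled));            // true, via the identity check
        System.out.println(sameContents(pooled, new String("abc"))); // true, via the character scan
        System.out.println(sameContents(pooled, "abd"));             // false
    }
}
```

With this shape, comparing two references to the same interned string costs a single reference comparison, while unequal or non-interned strings still compare correctly.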

Riduidel answered Oct 03 '22 11:10


This article provides the full answer.

In Java 6 the string pool resides in PermGen; since Java 7 it resides in the heap.

Manually interned strings will be garbage-collected.
String literals will be only garbage collected if the class that defines them is unloaded.

The string pool is a hash table with a fixed number of buckets. The default size was small in Java 6 and early versions of Java 7, and was increased to 60013 in Java 7u40.
The size can be changed with the -XX:StringTableSize=<new size> JVM option, and the current value can be viewed with -XX:+PrintFlagsFinal.
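As an alternative to reading the -XX:+PrintFlagsFinal output, the current value can also be queried from inside a running program. The sketch below assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean interface is available; on other JVMs the lookup is likely to fail. The class name StringTableSizeProbe is made up for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class StringTableSizeProbe {
    // Returns the current StringTableSize flag value as reported by the JVM.
    // Assumption: running on HotSpot, where this diagnostic bean and flag exist.
    static String stringTableSize() {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption("StringTableSize").getValue();
    }

    public static void main(String[] args) {
        System.out.println("StringTableSize = " + stringTableSize());
    }
}
```

Running the program with a different -XX:StringTableSize=<new size> value should change what it prints.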

Alexander answered Oct 03 '22 13:10