
How can I optimize the code to always use the same reference to a String without increasing the memory usage?

I am working on an application that holds many duplicate Strings, and my task is to eliminate them to reduce memory usage. My first thought was to use String.intern to guarantee that only one instance of each String exists. That reduced heap usage, but it increased PermGen far too much: because many of the strings occur only once, the application's total memory usage actually went up.
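To make the String.intern behaviour concrete, here is a minimal sketch. The class name InternDemo and the helper sameAfterIntern are hypothetical, used only for illustration; the only real API involved is String.intern itself.

```java
// Hypothetical demo class, not part of the application in question.
final class InternDemo {

    // Returns true when both arguments resolve to the same pooled instance.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        // new String(...) forces two distinct objects with equal content.
        String a = new String("duplicate");
        String b = new String("duplicate");

        System.out.println(a == b);                   // false: two separate objects
        System.out.println(a.intern() == b.intern()); // true: one pooled instance
    }
}
```

Every distinct string passed to intern ends up in the JVM's string pool, so strings that occur only once still pay the cost of a pool entry, which matches the PermGen growth described above.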

After searching for other ideas, I found this approach: https://stackoverflow.com/a/725822/1384913.

The same thing happened as with String.intern: String memory usage decreased, but the memory I saved is now consumed by WeakHashMap and WeakHashMap$Entry instances.
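For readers who have not followed the link, a WeakHashMap-based pool typically looks something like the sketch below. This is an assumption about the general shape of the technique, not a copy of the linked answer; the class name StringPool and the field name POOL are hypothetical.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical canonicalizing pool built only on java.util.WeakHashMap.
final class StringPool {

    // Keys are held weakly, so an entry can be collected once no other code
    // references the string. The value must also be weak: a strong value
    // pointing at the key would keep the entry alive forever.
    private static final Map<String, WeakReference<String>> POOL = new WeakHashMap<>();

    private StringPool() {
    }

    // Returns the canonical instance equal to value, registering value if none exists.
    static synchronized String intern(String value) {
        if (value == null) {
            return null;
        }
        WeakReference<String> ref = POOL.get(value);
        String canonical = (ref != null) ? ref.get() : null;
        if (canonical == null) {
            // First time this content is seen (or the old instance was collected).
            POOL.put(value, new WeakReference<>(value));
            canonical = value;
        }
        return canonical;
    }
}
```

In a pool of this shape, each pooled string costs one WeakHashMap$Entry plus one WeakReference, in addition to the map's table. That per-entry overhead is consistent with the memory showing up under WeakHashMap and WeakHashMap$Entry, and it is not recovered for strings that occur only once.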

Is there an effective way to keep only one instance of each String without spending as much memory as I recover by doing so?

Daniel Pereira asked Oct 15 '12 18:10



1 Answer

I found an alternative to WeakHashMap: the WeakHashSet provided by the Eclipse JDT library. It behaves like WeakHashMap but uses less memory. You also only need to call its add method: it adds the String to the set if it is not already there, and returns the existing instance otherwise.

The only thing I didn't like is that it doesn't use generics, which forces the developer to cast the objects. My intern method turned out to be pretty simple, as you can see below:

Declaration of the WeakHashSet:

private static WeakHashSet stringPool = new WeakHashSet(30000); //30 thousand is the average number of Strings that the application keeps.

and the intern method:

public static String intern(String value) {
    if(value == null) {
        return null;
    }
    return (String) stringPool.add(value);
}
Daniel Pereira answered Sep 21 '22 06:09
