Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Use PermGen space or roll-my-own intern method?

I am writing a Codec to process messages sent over TCP using a bespoke wire protocol. During the decode process I create a number of Strings, BigDecimals and dates. The client-server access patterns mean that it is common for the client to issue a request and then decode thousands of response messages, which results in a large number of duplicate Strings, BigDecimals, etc.

Therefore I have created an InternPool<T> class allowing me to intern each class of object. Internally, the pool uses a WeakHashMap<T, WeakReference<T>>. For example:

InternPool<BigDecimal> pool = new InternPool<BigDecimal>();

...

// Read BigDecimal from in buffer and then intern.
BigDecimal quantity = pool.intern(readBigDecimal(in));

My question: I am using InternPool for BigDecimal but should I consider also using it for String instead of String's intern() method, which I believe uses PermGen space? What is the advantage of using PermGen space?

like image 776
Adamski Avatar asked May 19 '10 11:05

Adamski


2 Answers

If you already have such a InternPool class, it think it is better to use that than to choose a different interning method for Strings. Especially since String.intern() seems to give a much stronger guarantee than you actually need. Your goal is to reduce memory usage, so perfect interning for the lifetime of the JVM is not actually necessary.

Also, I'd use the Google Collections MapMaker to create a InternPool to avoid re-creating the wheel:

Map<BigDecimal,BigDecimal> bigDecimalPool = new MapMaker()
    .weakKeys()
    .weakValues()
    .expiration(1, TimeUnits.MINUTES)
    .makeComputingMap(
      new Function<BigDecimal, BigDecimal>() {
        public BigDecimal apply(BigDecimal value) {
          return value;
        }
      });

This would give you (correctly implemented) weak keys and values, thread safety, automatic purging of old entries and a very simple interface (a simple, well-known Map). To be sure you could also wrap it using Collections.immutableMap() to avoid bad code messing with it.

like image 97
Joachim Sauer Avatar answered Oct 13 '22 20:10

Joachim Sauer


It is likely that the JVM's String.intern() pool will be faster. AFAIK, it is implemented in native code, so it should be faster and use less space than a pool implemented using WeakHashMap and WeakReference. You would need to do some careful benchmarking to confirm this.

However, unless you have huge numbers of long-lived duplicate objects, I doubt that interning (either in permGen or with your own pools) will make much difference. And if the ratio of unique to duplicate objects is too low, then interning will just increase the number of live objects (making the GC take longer) and reduce performance due the overheads of interning, and so on. So I would also advocate benchmarking the "intern" versus "no intern" approaches.

like image 30
Stephen C Avatar answered Oct 13 '22 19:10

Stephen C