Java heap analysis with OQL: count unique strings

Question

I'm doing a memory analysis of an existing Java application. Is there an equivalent of SQL's 'group by' in OQL that shows the number of objects that hold the same value but are different instances?

select count(*) from java.lang.String s group by s.toString()

I'd like to get a list of duplicated strings along with the number of duplicates. The purpose is to find the cases with large counts so that they can be optimized using String.intern().

Example:

"foo"    100
"bar"    99
"lazy fox"    50

etc...

Johan Kaving · Accepted Answer

The following is based on the answer by Peter Dolberg and can be used in the VisualVM OQL Console:

var counts = {};
var alreadyReturned = {};

filter(
  sort(
    // One entry per String instance, carrying the running count for its value.
    map(heap.objects("java.lang.String"),
    function(heapString) {
      var key = heapString.toString();
      if (!counts[key]) {
        counts[key] = 1;
      } else {
        counts[key] = counts[key] + 1;
      }
      return { string: key, count: counts[key] };
    }),
    // Numeric comparator: highest count first.
    'rhs.count - lhs.count'),
  // Keep only the first (highest-count) entry for each distinct value.
  function(countObject) {
    if (!alreadyReturned[countObject.string]) {
      alreadyReturned[countObject.string] = true;
      return true;
    } else {
      return false;
    }
  }
);

It starts with a map() call over all String instances. For each String it creates or updates a counter in the counts object (a plain JavaScript object used as a map from string value to count) and returns an object with a string field and a count field.

The resulting array contains one entry per String instance, each with a count one larger than the previous entry for the same String value. This array is then sorted by the count field in descending order, so the result looks something like this:

{
count = 1028.0,
string = *null*
}

{
count = 1027.0,
string = *null*
}

{
count = 1026.0,
string = *null*
}

...

(in my test the String "*null*" was the most common).
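
The running counts produced by the map() and sort() steps can be reproduced outside VisualVM. The following is a minimal sketch in plain JavaScript, assuming a made-up sampleStrings array in place of heap.objects("java.lang.String") and the standard Array methods in place of the OQL built-ins:

```javascript
// Plain JavaScript stand-in for the map() and sort() steps. sampleStrings is
// made-up data replacing heap.objects("java.lang.String").
var sampleStrings = ["foo", "bar", "foo", "foo", "bar"];
var counts = {};

// map(): one entry per instance, carrying the running count for its value.
var entries = sampleStrings.map(function (s) {
  counts[s] = (counts[s] || 0) + 1;
  return { string: s, count: counts[s] };
});

// sort(): highest running count first.
entries.sort(function (lhs, rhs) {
  return rhs.count - lhs.count;
});

console.log(JSON.stringify(entries));
```

There are still five entries, one per input string. The entry at the top is "foo" with a count of 3, which is the total for that value.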

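The last step filters this list with a function that returns true only for the first occurrence of each String. Because the list is sorted by descending count, that first occurrence carries the total count for the value. The function uses the alreadyReturned object to keep track of which Strings have already been included.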
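
The same table can probably also be built by aggregating first and sorting afterward, which avoids creating one entry per String instance. This is an alternative sketch in plain JavaScript, not the method used above. countDuplicates is a hypothetical helper, the input array is made up, and the code has not been checked against the VisualVM OQL console:

```javascript
// Hypothetical helper, not part of the OQL API: returns one
// { string, count } entry per distinct value, most frequent first.
function countDuplicates(strings) {
  // Object.create(null) has no inherited keys such as "toString".
  var counts = Object.create(null);
  strings.forEach(function (s) {
    counts[s] = (counts[s] || 0) + 1;
  });
  return Object.keys(counts)
    .map(function (s) { return { string: s, count: counts[s] }; })
    .sort(function (lhs, rhs) { return rhs.count - lhs.count; });
}

var result = countDuplicates(["foo", "bar", "foo", "lazy fox", "foo", "bar"]);
console.log(JSON.stringify(result));
// prints [{"string":"foo","count":3},{"string":"bar","count":2},{"string":"lazy fox","count":1}]
```

The sort here runs over distinct values only, so it should be cheaper than sorting one entry per instance on a heap with many duplicates.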

Palesz · Answer

I would use Eclipse Memory Analyzer instead.

Peter Dolberg · Answer

Sadly, there isn't an equivalent to "group by" in OQL. I'm assuming you're talking about the OQL that is used in jhat and VisualVM.

There is an alternative, though. If you use pure JavaScript syntax instead of the "select x from y" syntax then you have the full power of JavaScript to work with.

Even so, the alternative way of getting the information you're looking for isn't simple. For example, here's an OQL "query" that will perform the same task as your query:

var set = {};
sum(map(heap.objects("java.lang.String"), function(heapString) {
  var key = heapString.toString();
  if (set[key]) {
    return 0;  // already seen: not counted
  } else {
    set[key] = true;
    return 1;  // first occurrence
  }
}));

In this example a regular JavaScript object mimics a set (a collection with no duplicates). As the map function goes through each string, the set is used to determine whether the string has already been seen. Duplicates don't count toward the total (return 0), but new strings do (return 1).
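
The object-as-set idea can be tried in plain JavaScript as well. This is a minimal sketch, assuming a made-up sampleStrings array in place of the heap and Array.prototype.reduce in place of the OQL sum() built-in. It creates the set with Object.create(null), because a plain {} inherits properties such as toString, and a heap string with that value would then look as if it had already been seen:

```javascript
// sampleStrings is made-up data standing in for heap.objects("java.lang.String").
var sampleStrings = ["foo", "bar", "foo", "toString", "bar"];

// Object.create(null) has no inherited properties, so a value such as
// "toString" is not mistaken for one that was already seen.
var seen = Object.create(null);

var distinct = sampleStrings
  .map(function (s) {
    if (seen[s]) {
      return 0; // duplicate: contributes nothing
    }
    seen[s] = true;
    return 1; // first occurrence
  })
  .reduce(function (total, n) { return total + n; }, 0);

console.log(distinct); // prints 3
```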

Tags:

java

performance

heap-memory

memory

paweloque
