
Java heap analysis with OQL: count unique strings

I'm doing a memory analysis of an existing Java application. Is there an equivalent of SQL's 'group by' in OQL that counts objects which have the same value but are different instances? Something like:

select count(*) from java.lang.String s group by s.toString()

I'd like to get a list of duplicated strings along with the number of duplicates. The goal is to find the strings with the largest counts so that they can be optimized using String.intern().

Example:

"foo"    100
"bar"    99
"lazy fox"    50

etc...

paweloque asked Nov 23 '11 12:11



3 Answers

The following is based on the answer by Peter Dolberg and can be used in the VisualVM OQL Console:

var counts = {};
var alreadyReturned = {};

filter(
  sort(
    map(heap.objects("java.lang.String"),
      function(heapString) {
        // Running count of how often this string value has been seen so far.
        var key = heapString.toString();
        counts[key] = (counts[key] || 0) + 1;
        return { string: key, count: counts[key] };
      }),
    // Numeric comparator, descending: the highest count of each string comes first.
    'rhs.count - lhs.count'),
  function(countObject) {
    // Keep only the first (highest-count) entry for each string.
    if (!alreadyReturned[countObject.string]) {
      alreadyReturned[countObject.string] = true;
      return true;
    }
    return false;
  }
);

It starts with a map() call over all String instances. For each String it creates or increments that string's entry in the counts object (a plain JavaScript object used as a map) and returns a new object with a string and a count field.

The resulting array contains one entry for each String instance, each with a count value one larger than the previous entry for the same String. The array is then sorted in descending order on the count field, so the result looks something like this:

{
count = 1028.0,
string = *null*
}

{
count = 1027.0,
string = *null*
}

{
count = 1026.0,
string = *null*
}

...

(in my test the String "*null*" was the most common).

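The last step is to filter this using a function that returns true only for the first occurrence of each String. Because the array is sorted by descending count, that first occurrence carries the total number of instances. The alreadyReturned object keeps track of which Strings have already been included.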
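The counting idea can also be tried outside a heap dump. The following is only a sketch in plain JavaScript, not something taken from the VisualVM documentation: `sampleStrings` and `countDuplicates` are made-up names, and the array stands in for the values that `heap.objects("java.lang.String")` would supply. It counts in a single pass and sorts the distinct strings afterwards, which avoids creating one record per String instance.

```javascript
// Hypothetical stand-in for heap.objects("java.lang.String"); in VisualVM the
// values would come from heapString.toString() instead.
var sampleStrings = ["foo", "bar", "foo", "lazy fox", "foo", "bar"];

function countDuplicates(strings) {
  // Object.create(null) has no inherited keys, so strings such as
  // "constructor" cannot collide with built-in properties.
  var counts = Object.create(null);
  strings.forEach(function (s) {
    counts[s] = (counts[s] || 0) + 1;
  });

  // Turn the map into { string, count } records, most frequent first.
  return Object.keys(counts)
    .map(function (s) { return { string: s, count: counts[s] }; })
    .sort(function (lhs, rhs) { return rhs.count - lhs.count; });
}

var result = countDuplicates(sampleStrings);
result.forEach(function (entry) {
  console.log(JSON.stringify(entry.string) + "    " + entry.count);
});
```

Whether `Object.create(null)` and `forEach` are available inside the OQL console depends on the JavaScript engine bundled with your JDK or VisualVM version, so treat this as an illustration of the approach and not as a drop-in query.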

Johan Kaving answered Oct 22 '22 23:10



I would use Eclipse Memory Analyzer instead.

Palesz answered Oct 22 '22 23:10



Sadly, there isn't an equivalent to "group by" in OQL. I'm assuming you're talking about the OQL that is used in jhat and VisualVM.

There is an alternative, though. If you use pure JavaScript syntax instead of the "select x from y" syntax then you have the full power of JavaScript to work with.

Even so, the alternative way of getting the information you're looking for isn't simple. For example, here's an OQL "query" that will perform the same task as your query:

var set = {};
sum(map(heap.objects("java.lang.String"), function(heapString) {
  var key = heapString.toString();
  if (set[key]) {
    return 0;  // already seen: a duplicate, does not count
  }
  set[key] = true;
  return 1;    // first occurrence of this value
}));

In this example a regular JavaScript object mimics a set (a collection with no duplicates). As the map function goes through each string, the set is used to determine whether the string has already been seen. Duplicates don't count toward the total (return 0) but new strings do (return 1), so the sum is the number of distinct string values.
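One caveat when a plain object is used as a set: every ordinary object inherits members such as `toString` and `constructor`, so a bare truthiness check like `if (set[key])` treats those strings as already seen. The sketch below shows one way to guard against that with `hasOwnProperty`. It is plain JavaScript with made-up names (`sampleStrings`, `countUnique`), and the sample array is a hypothetical stand-in for the strings in a heap dump.

```javascript
// Hypothetical sample data standing in for the strings in a heap dump.
var sampleStrings = ["foo", "toString", "foo", "bar", "toString"];

function countUnique(strings) {
  var seen = {};
  var hasOwn = Object.prototype.hasOwnProperty;
  var unique = 0;
  for (var i = 0; i < strings.length; i++) {
    // hasOwnProperty ignores inherited members such as toString, which a
    // plain truthiness check (if (seen[key])) would mistake for "already seen".
    if (!hasOwn.call(seen, strings[i])) {
      seen[strings[i]] = true;
      unique++;
    }
  }
  return unique;
}

var uniqueCount = countUnique(sampleStrings);
console.log(uniqueCount);
```

The same guard should carry over to the OQL version by replacing the `if (set[...])` test, since `hasOwnProperty` is part of the core language; I have not verified this in every VisualVM release.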

Peter Dolberg answered Oct 23 '22 00:10
