Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is this method a hot spot?

I'm writing a (simple!) linear algebra library. In the implementation of matrix multiplication, a VisualVM performance sample is telling me that the algorithm spends 85% of its time ("self time", specifically) in the following method when multiplying large matrices (5k x 120k):

public double next() {
    double result;

    if(hasNext())
        result = vis[i++].next();
    else
        throw new IllegalStateException("No next value");

    return result;
}

Without going into too much detail (sorry, I can't share more code), this method is the next() method of an "iterator" for the matrix. (You can think of the class this method lives in as a row iterator made up of individual column iterators, which are stored in vis.) I'm not surprised that this method gets called a lot since it's an iterator, but I am surprised that the program spends a lot of its time in this method. This method doesn't do too much, so why is it spending its time here?

Here are the specific questions I'm asking:

  1. Is there some "gotcha" of VisualVM I'm hitting? For example, could the JIT be confusing VisualVM in some way that's causing VisualVM to attribute time to the wrong method?
  2. Why would the program be spending its time here? The method just doesn't do too much. In particular, I don't think cache effects explain this problem because the vis array is much smaller than the data of the matrices being multiplied.

In case it's useful, here's a jad disassembly of the method I pasted above:

public double next()
{
    double result;
    if(hasNext())
//*   0    0:aload_0         
//*   1    1:invokevirtual   #88  <Method boolean hasNext()>
//*   2    4:ifeq            32
        result = vis[i++].next();
//    3    7:aload_0         
//    4    8:getfield        #42  <Field VectorIterator[] vis>
//    5   11:aload_0         
//    6   12:dup             
//    7   13:getfield        #28  <Field int i>
//    8   16:dup_x1          
//    9   17:iconst_1        
//   10   18:iadd            
//   11   19:putfield        #28  <Field int i>
//   12   22:aaload          
//   13   23:invokeinterface #72  <Method double VectorIterator.next()>
//   14   28:dstore_1        
    else
//*  15   29:goto            42
        throw new IllegalStateException("No next value");
//   16   32:new             #89  <Class IllegalStateException>
//   17   35:dup             
//   18   36:ldc1            #91  <String "No next value">
//   19   38:invokespecial   #93  <Method void IllegalStateException(String)>
//   20   41:athrow          
    return result;
//   21   42:dload_1         
//   22   43:dreturn         
}

Thanks in advance for your help guys!

like image 317
sigpwned Avatar asked Apr 17 '13 16:04

sigpwned


People also ask

What is hot spot method?

Overview. Hotspot analysis is a spatial analysis and mapping technique interested in the identification of clustering of spatial phenomena. These spatial phenomena are depicted as points in a map and refer to locations of events or objects.

What is a hot spot Why is this significant?

“a hot spot is an area that has a greater than average number of criminal or disorder events, or an area where people have a higher than average risk of victimization.” An area can be considered a hotspot if a higher than average occurrence of the the event being analysed is found in a cluster.

How do you identify a hot spot?

Analysts use geographic information systems (GIS) to combine street maps, data about crime and public disorder, and data about other features such as schools, liquor stores, warehouses and bus stops. The resulting multidimensional maps produce a visual display of the hot spots.

What is a hot spot answer?

A hot spot is an area on Earth over a mantle plume or an area under the rocky outer layer of Earth, called the crust, where magma is hotter than surrounding magma. The magma plume causes melting and thinning of the rocky crust and widespread volcanic activity.


3 Answers

I figured out that this method looked like a hotspot because VisualVM was being instructed to ignore methods from the JRE in its profiling. The time being spent in those "ignored" methods was (ostensibly) being rolled into the self time of the topmost non-ignored entry of the call stack.

Below is the settings screen in VisualVM, including the "Do not profile packages" setting that was making the data wrong. To adjust the "ignore class" settings, you have to (1) click the "Settings" checkbox highlighted in red, then (2) adjust the classes setting highlighted in blue.

VisualVM Settings Screen

Depending on what you're doing, it probably makes sense at least not to ignore the java.* and javax.* packages.

like image 60
sigpwned Avatar answered Nov 05 '22 06:11

sigpwned


I don't know VisualVM from experience.

First determine if its instrumenting the bytecode to gather statistics. If so, look no further - instrumenting a short method always overinflates its self time (measuring the time and incrementing the statistics counter costs more time than the method itself).

But its always possible that the iterator consumes more time than the calculation itself. Imagine just summing up a matrix. Adding a float value to a local sum variable costs a lot less time than invoking a method, checking an invariant and finally accessing the array.

like image 25
Durandal Avatar answered Nov 05 '22 04:11

Durandal


Forget the profiler. Just pause the darn thing a few times and examine the stack. If 85% of the time goes into that routine, then chances are 85% on each pause that you will see exactly where it is in that routine, and exactly where it comes from. You can even see where it is in the process of multiplying the matrices. Thousands of samples won't tell you that.

My own sense is that calling that function, then doing hasNext, then doing Next on every single element is going to be a lot slower than i++.

like image 29
Mike Dunlavey Avatar answered Nov 05 '22 06:11

Mike Dunlavey