I'm writing a (simple!) linear algebra library. In the implementation of matrix multiplication, a VisualVM performance sample is telling me that the algorithm spends 85% of its time ("self time", specifically) in the following method when multiplying large matrices (5k x 120k):
public double next() {
double result;
if(hasNext())
result = vis[i++].next();
else
throw new IllegalStateException("No next value");
return result;
}
Without going into too much detail (sorry, I can't share more code), this method is the next()
method of an "iterator" for the matrix. (You can think of the class this method lives in as a row iterator made up of individual column iterators, which are stored in vis
.) I'm not surprised that this method gets called a lot since it's an iterator, but I am surprised that the program spends a lot of its time in this method. This method doesn't do too much, so why is it spending its time here?
Here are the specific questions I'm asking:
vis
array is much smaller than the data of the matrices being multiplied.In case it's useful, here's a jad disassembly of the method I pasted above:
public double next()
{
double result;
if(hasNext())
//* 0 0:aload_0
//* 1 1:invokevirtual #88 <Method boolean hasNext()>
//* 2 4:ifeq 32
result = vis[i++].next();
// 3 7:aload_0
// 4 8:getfield #42 <Field VectorIterator[] vis>
// 5 11:aload_0
// 6 12:dup
// 7 13:getfield #28 <Field int i>
// 8 16:dup_x1
// 9 17:iconst_1
// 10 18:iadd
// 11 19:putfield #28 <Field int i>
// 12 22:aaload
// 13 23:invokeinterface #72 <Method double VectorIterator.next()>
// 14 28:dstore_1
else
//* 15 29:goto 42
throw new IllegalStateException("No next value");
// 16 32:new #89 <Class IllegalStateException>
// 17 35:dup
// 18 36:ldc1 #91 <String "No next value">
// 19 38:invokespecial #93 <Method void IllegalStateException(String)>
// 20 41:athrow
return result;
// 21 42:dload_1
// 22 43:dreturn
}
Thanks in advance for your help guys!
Overview. Hotspot analysis is a spatial analysis and mapping technique interested in the identification of clustering of spatial phenomena. These spatial phenomena are depicted as points in a map and refer to locations of events or objects.
“a hot spot is an area that has a greater than average number of criminal or disorder events, or an area where people have a higher than average risk of victimization.” An area can be considered a hotspot if a higher than average occurrence of the the event being analysed is found in a cluster.
Analysts use geographic information systems (GIS) to combine street maps, data about crime and public disorder, and data about other features such as schools, liquor stores, warehouses and bus stops. The resulting multidimensional maps produce a visual display of the hot spots.
A hot spot is an area on Earth over a mantle plume or an area under the rocky outer layer of Earth, called the crust, where magma is hotter than surrounding magma. The magma plume causes melting and thinning of the rocky crust and widespread volcanic activity.
I figured out that this method looked like a hotspot because VisualVM was being instructed to ignore methods from the JRE in its profiling. The time being spent in those "ignored" methods was (ostensibly) being rolled into the self time of the topmost non-ignored entry of the call stack.
Below is the settings screen in VisualVM, including the "Do not profile packages" setting that was making the data wrong. To adjust the "ignore class" settings, you have to (1) click the "Settings" checkbox highlighted in red, then (2) adjust the classes setting highlighted in blue.
Depending on what you're doing, it probably makes sense at least not to ignore the java.*
and javax.*
packages.
I don't know VisualVM from experience.
First determine if its instrumenting the bytecode to gather statistics. If so, look no further - instrumenting a short method always overinflates its self time (measuring the time and incrementing the statistics counter costs more time than the method itself).
But its always possible that the iterator consumes more time than the calculation itself. Imagine just summing up a matrix. Adding a float value to a local sum variable costs a lot less time than invoking a method, checking an invariant and finally accessing the array.
Forget the profiler. Just pause the darn thing a few times and examine the stack. If 85% of the time goes into that routine, then chances are 85% on each pause that you will see exactly where it is in that routine, and exactly where it comes from. You can even see where it is in the process of multiplying the matrices. Thousands of samples won't tell you that.
My own sense is that calling that function, then doing hasNext
, then doing Next
on every single element is going to be a lot slower than i++
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With