I am pretty new to Java, coming from the C++ world. I am running some server code with a method foo() that is called a few million times every second. This code is latency sensitive, and the method also shows up in the profiler as consuming 20% of the overall CPU usage of the process.
int foo_old() {
    if (Float.isNaN(this.x)) { // shows up in profiling
        res = do some computation; // some floating point comparison, doesn't show up in profiling
        return res;
    } else {
        // happens 99% of the time
        res = do something else; // some floating point comparison, doesn't show up in profiling
        return res;
    }
}
Is there an easy way for me to test whether my method foo() will be inlined or not? Can I tell from the profiler's stack trace information on a running server?
I tried to optimize by simplifying foo(). There is a Float.isNaN check inside foo() that also shows up in the profiler; I was surprised to see that the NaN check is slower than other boolean operations (less-than / greater-than floating point comparisons).
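For context on why the NaN check itself should be cheap: Float.isNaN(v) is specified as a self-comparison (v != v), which the JIT normally compiles to a single floating-point compare. A minimal sketch (class and method names are my own, not from the question):

```java
// Sketch: Float.isNaN is equivalent to a self-comparison, since NaN is the
// only float value that is not equal to itself. If the check shows up in a
// profiler, the cost is usually the poorly predicted branch around it rather
// than the comparison instruction itself.
public class NanCheckSketch {
    static boolean isNanViaSelfCompare(float v) {
        return v != v; // true only when v is NaN
    }

    public static void main(String[] args) {
        float nan = 0.0f / 0.0f; // produces NaN
        System.out.println(isNanViaSelfCompare(nan));  // true
        System.out.println(isNanViaSelfCompare(1.0f)); // false
        System.out.println(Float.isNaN(nan));          // true
    }
}
```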
One approach I tried was to remove the NaN check: since I know at object-construction time whether an object needs the NaN check, I stored a functional interface as a member variable and assigned it either foo_old (which has the NaN check) or foo_optimized (which doesn't), based on a property known when the object is created. In the constructor I assign the correct method reference to this interface.
class A {
    final FuncIf test; // functional interface with the same signature as foo_old, foo_optimized

    public A(boolean optimize) {
        test = optimize ? this::foo_optimized : this::foo_old;
    }

    // same as the original foo mentioned above
    int foo_old() {
        ...
    }

    // no NaN check
    int foo_optimized() {
        res = do some computation;
        return res;
    }
}
Now when I create objects, I know at compile time / object-construction time which version of foo to use, so I assign the interface variable to the correct version. After deployment I observed that latency in fact increased, by a little under 10%, even though many of the objects now use the optimized version of foo.
Is it because foo was previously a direct method call, and as soon as I go through an interface reference the extra indirection of the virtual dispatch becomes the overhead I see in latency (the cost of the interface call being much greater than the NaN check itself)? Can't the JVM compiler inline this interface method?
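One alternative worth sketching (my own suggestion, not from the question): instead of dispatching through a per-object method reference (an interface call through a lambda object), keep the construction-time decision as a final boolean field and branch on it inside foo(). The branch is perfectly predictable per object, and foo() stays a plain, easily inlinable instance method. The class shape follows the question; the computations are placeholders:

```java
// Sketch: replace per-object virtual dispatch with a predictable branch on a
// final field set in the constructor. Return values are placeholders for the
// "some computation" / "something else" paths in the question.
public class DirectBranchSketch {
    static final class A {
        final boolean skipNanCheck; // decided at construction, like 'optimize'
        final float x;

        A(boolean skipNanCheck, float x) {
            this.skipNanCheck = skipNanCheck;
            this.x = x;
        }

        int foo() {
            if (!skipNanCheck && Float.isNaN(x)) {
                return -1; // placeholder for the NaN path
            }
            return 1;      // placeholder for the common path
        }
    }

    public static void main(String[] args) {
        A optimized = new A(true, Float.NaN);
        A checked = new A(false, Float.NaN);
        System.out.println(optimized.foo()); // 1: NaN check skipped
        System.out.println(checked.foo());   // -1: NaN path taken
    }
}
```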
The only educated guess would be to measure its bytecode size; use javap for that. The JVM has two JIT compilers, C1 and C2, and both can inline that method.
There are three parameters the JVM cares about when inlining (at least, these are the ones I know of; there are more):
-XX:MaxInlineSize (35 by default)
-XX:FreqInlineSize (325 by default)
-XX:MinInliningThreshold (250 by default)
If your method is called fewer than MinInliningThreshold (250) times, it obeys the MaxInlineSize rule, meaning that if it is smaller than 35 bytes of bytecode, it will be inlined. If it is called more often than that, it obeys FreqInlineSize instead, which is 325 bytes (a lot more).
You can also print what is (and isn't) being inlined via these parameters:
-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining
As a result of running with those, you will see messages like:
callee is too large
This is printed by C1 and tells you that MaxInlineSize was exceeded for that method. Or:
too big
printed by C2 when MaxInlineSize is exceeded. Or:
hot method too big
printed by C2 when FreqInlineSize is exceeded.
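A minimal way to see these diagnostics yourself: compile the program below and run it with `java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining HotLoop`, then look for lines mentioning HotLoop::add. The method is tiny (well under MaxInlineSize) and the call site is hot, so it should be reported as inlined (class and method names here are mine, chosen for illustration):

```java
// A hot call site with a tiny callee: a straightforward inlining candidate.
public class HotLoop {
    static int add(int a, int b) {
        return a + b; // only a few bytes of bytecode
    }

    public static void main(String[] args) {
        long sum = 0;
        for (int i = 0; i < 5_000_000; i++) {
            sum += add(i, 1); // hot loop drives C1, then C2 compilation
        }
        System.out.println(sum); // keep the result live so the loop isn't eliminated
    }
}
```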