I have been reading Effective Java, 3/E.
While reading the part about hashCode (page 51), I noticed the book saying:

"A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance on some architectures: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically."
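As a quick sanity check (not from the book, just to convince myself), the identity is easy to verify in Kotlin, where the shift operator is spelled shl:

fun main() {
    // 31 * i == (i shl 5) - i, i.e. 32*i - i. This holds for every Int,
    // even when the shift overflows, because Int arithmetic is modular
    // two's-complement arithmetic.
    for (i in intArrayOf(0, 1, -1, 42, Int.MAX_VALUE, Int.MIN_VALUE)) {
        check(31 * i == (i shl 5) - i)
    }
    println("identity holds for the sampled values")
}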
This made sense to me, and I wondered how much faster the code would become with this kind of optimization in place. So I wrote some short code to see the impact of such an optimization.
But there seemed to be no noticeable difference, so I wrote much simpler code to check whether the optimization takes place at all.
Below is my sample code.
import kotlin.random.Random

fun main() {
    val num = Random.nextInt()
    val a = num * 30
    val b = num * 31
    val c = num * 32
    println("$a, $b, $c")
}
And this is the compiled JVM bytecode, obtained from IntelliJ's "Show Kotlin Bytecode" feature (javap -c on the compiled class shows the same thing):
L1
    LINENUMBER 5 L1
    ILOAD 0
    BIPUSH 30
    IMUL
    ISTORE 1
L2
    LINENUMBER 6 L2
    ILOAD 0
    BIPUSH 31
    IMUL
    ISTORE 2
L3
    LINENUMBER 7 L3
    ILOAD 0
    BIPUSH 32
    IMUL
    ISTORE 3
Apparently, there is no difference: each multiplication just pushes the constant and calls IMUL. I thought maybe the optimization takes place when the Java bytecode is compiled into actual machine code, but I have never looked at that side, so I do not know how to confirm this theory. I searched, and it seems the keyword I am looking for is the JIT compiler, which converts .class files into CPU-specific machine code.
I thought I could run this code through a JIT compiler and look at the CPU-specific machine code it produces, but then I would only have checked the theory on one specific CPU, not all CPUs. I want to see whether it is "generally true", but checking every architecture would take too much time.
So, is there any way to confirm that the code above is actually (generally) optimized? And when I have similar questions in the future, where should I look? When I am curious about Java behavior, I go to Oracle and check the JVM specification or the Java SE reference, but where do I start for compiler behavior?
That was a long question. Thank you for spending your precious time reading it.
(just an additional note)
I checked C and Python on https://godbolt.org/ and confirmed that, for C, the multiplication is actually optimized.
#include <stdlib.h> /* for rand() */

int test(int num) {
    int n = rand();
    int a = n * 30;
    int b = n * 31;
    int c = n * 32;
    return a * b * c;
}
test:
        push    rbp
        mov     rbp, rsp
        sub     rsp, 32
        mov     DWORD PTR [rbp-20], edi
        call    rand
        mov     DWORD PTR [rbp-4], eax
        mov     eax, DWORD PTR [rbp-4]
        imul    eax, eax, 30              # n * 30: stays a real multiply
        mov     DWORD PTR [rbp-8], eax
        mov     edx, DWORD PTR [rbp-4]
        mov     eax, edx
        sal     eax, 5                    # n * 31: becomes (n << 5) ...
        sub     eax, edx                  # ... - n
        mov     DWORD PTR [rbp-12], eax
        mov     eax, DWORD PTR [rbp-4]
        sal     eax, 5                    # n * 32: becomes n << 5
        mov     DWORD PTR [rbp-16], eax
        mov     eax, DWORD PTR [rbp-8]
        imul    eax, DWORD PTR [rbp-12]
        imul    eax, DWORD PTR [rbp-16]
        leave
        ret
Python, however, was not:

from random import randint

num = randint(0, 2**31 - 1)  # randint needs bounds; any range works here
a = num * 30
b = num * 31
c = num * 32
  5          18 LOAD_NAME                2 (num)
             20 LOAD_CONST               2 (30)
             22 BINARY_MULTIPLY
             24 STORE_NAME               3 (a)

  6          26 LOAD_NAME                2 (num)
             28 LOAD_CONST               3 (31)
             30 BINARY_MULTIPLY
             32 STORE_NAME               4 (b)

  7          34 LOAD_NAME                2 (num)
             36 LOAD_CONST               4 (32)
             38 BINARY_MULTIPLY
             40 STORE_NAME               5 (c)
             42 LOAD_CONST               5 (None)
             44 RETURN_VALUE
The compiler doesn't optimize the bytecode because it will be optimized at run time by the JIT compiler. If the runtime you are targeting doesn't have a JIT optimizer (even if it has a JIT compiler), or you are compiling ahead of time, I recommend using an optimizing obfuscator like ProGuard or Allatori.
The JIT compiler helps improve the performance of Java programs by compiling bytecodes into native machine code at run time. The JIT compiler is enabled by default. When a method has been compiled, the JVM calls the compiled code of that method directly instead of interpreting it.
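If you want to confirm this on your own machine, HotSpot can print the machine code it generates for a method. A minimal sketch (assuming the file is named Main.kt, so Kotlin compiles the top-level functions into a class called MainKt):

import kotlin.random.Random

// A multiply-by-31 kept in its own function so it is easy to find in the
// JIT's output, and called often enough to actually get compiled.
fun mul31(i: Int): Int = i * 31

fun main() {
    var acc = 0
    repeat(10_000_000) { acc += mul31(Random.nextInt()) }
    println(acc) // keep the result live so the loop is not optimized away
}

Run it with the diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,MainKt::mul31 (readable disassembly additionally requires the hsdis library on your machine), and once the optimizing compiler kicks in you should see a shift and a subtract instead of an imul on x86-64. Since every JIT targets the CPU it is actually running on, "generally true" can only mean "true for each JIT implementation and architecture you check"; a tool like JITWatch makes this kind of inspection much easier.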
Languages like C are compiled ahead of time, which means all optimizations happen at compile time: the compiler emits machine code that the local machine executes directly.
Kotlin, Scala, Java, and other JVM languages run on the Java Virtual Machine, and JVM implementations perform runtime optimizations. This is called Just-In-Time (JIT) compilation. An example of a JIT compiler is HotSpot's: as the name suggests, it finds "hot spots" in the running program and compiles and optimizes them to native code. An alternative to HotSpot is the OpenJ9 JVM, which has its own JIT.
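This is also why the timing experiment in the question showed no difference: writing (i shl 5) - i by hand just duplicates what the JIT already does with i * 31, so both versions usually end up as the same machine code. If you want to measure it anyway, use a harness such as JMH, which takes care of warm-up and dead-code elimination for you. A rough sketch (assumes the org.openjdk.jmh dependency; the class is open because JMH generates subclasses of it):

import org.openjdk.jmh.annotations.Benchmark
import org.openjdk.jmh.annotations.Scope
import org.openjdk.jmh.annotations.State

@State(Scope.Thread)
open class Mul31Benchmark {
    private var i = 1_234_567

    // What the bytecode's IMUL expresses: a plain constant multiply.
    @Benchmark
    fun mulConstant(): Int = i * 31

    // The hand-written shift-and-subtract form from Effective Java.
    @Benchmark
    fun shiftAndSubtract(): Int = (i shl 5) - i
}

On a modern HotSpot you should expect the two benchmarks to report effectively identical times.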
Languages like Python are, I believe, interpreted at run time, which means no such optimizations are involved. AOT compilers for Python might do some optimizations, but I don't really know the details of those implementations.