I'm comparing the efficiency of nested for, while and do-while loops in Java, and I've come across some strange results that I need help understanding.
public class Loops {
public static void main(String[] args) {
int L = 100000; // number of iterations per loop
// for loop
double start = System.currentTimeMillis();
long s1 = 0;
for (int i=0; i < L; i++) {
for (int j = 0; j < L; j++) {
s1 += 1;
}
}
double end = System.currentTimeMillis();
String result1 = String.format("for loop: %.5f", (end-start) / 1000);
System.out.println(s1);
System.out.println(result1);
// do-while loop
double start1 = System.currentTimeMillis();
int i = 0;
long s2 = 0;
do {
i++;
int j = 0;
do {
s2 += 1;
j++;
} while (j < L);
} while (i < L);
double end1 = System.currentTimeMillis();
String result2 = String.format("do-while: %.5f", (end1-start1) / 1000);
System.out.println(s2);
System.out.println(result2);
// while loop
double start2 = System.currentTimeMillis();
i = 0;
long s3 = 0;
while (i < L) {
i++;
int j = 0;
while (j < L) {
s3 += 1;
j++;
}
}
double end2 = System.currentTimeMillis();
String result3 = String.format("while: %.5f", (end2-start2) / 1000);
System.out.println(s3);
System.out.println(result3);
}
}
All of the loops respective counters sum to 10 billion; the results perplex me:
for loop: 6.48300
do-while: 0.41200
while: 9.71500
Why is the do-while loop so much quicker? This performance gap scales in parallel with any changes to L. I've run these loops independently and they perform the same.
The do-while loop is doing fewer comparisons than the for and while loops.
The for loop method is faster when using ArrayList because the for-each is implemented by the iterator, and it needs to perform concurrent modification verification.
I have run the code you have provided and also was surprised to see these differences in performance. Lead by curiosity I started investigating and found out that indeed despite these loops seem to be doing the same thing there are some important differences between them.
My results after the first run of these loops were:
for loop: 1.43100
do-while: 0.51300
while: 1.54500
But when I run these three loops at least 10 times then the performance of each of these loop was pretty much the same.
for loop: 0.43200
do-while: 0.46100
while: 0.42900
The JIT is able to optimize these loops over time, but there must be some dissimilarity causing these loops to have a different initial performance. In fact there are actually two differences:
do-while
loop is doing fewer comparisons than the for
and while
loopsFor simplicity assume L = 1
long s1 = 0;
for (int i=0; i < L; i++) {
for (int j = 0; j < L; j++) {
s1 += 1;
outer loop: 0 < 1
inner loop: 0 < 1
inner loop: 1 < 1
outer loop: 1 < 1
4 comparisons in total
int i = 0;
long s2 = 0;
do {
i++;
int j = 0;
do {
s2 += 1;
j++;
} while (j < L);
} while (i < L);
inner loop: 1 < 1
outer loop: 1 < 1
2 comparisons in total
For the purpose of further investigation I have changed your class slightly, not impacting the way it works.
public class Loops {
final static int L = 100000; // number of iterations per loop
public static void main(String[] args) {
int round = 10;
while (round-- > 0) {
forLoop();
doWhileLoop();
whileLoop();
}
}
private static long whileLoop() {
int i = 0;
long s3 = 0;
while (i++ < L) {
int j = 0;
while (j++ < L) {
s3 += 1;
}
}
return s3;
}
private static long doWhileLoop() {
int i = 0;
long s2 = 0;
do {
int j = 0;
do {
s2 += 1;
} while (++j < L);
} while (++i < L);
return s2;
}
private static long forLoop() {
long s1 = 0;
for (int i = 0; i < L; i++) {
for (int j = 0; j < L; j++) {
s1 += 1;
}
}
return s1;
}
}
Then compiled it and invoked javap -c -s -private -l Loop
to get the bytecode.
First the bytecode of doWhileLoop.
0: iconst_0 // push the int value 0 onto the stack
1: istore_1 // store int value into variable 1 (i)
2: lconst_0 // push the long 0 onto the stack
3: lstore_2 // store a long value in a local variable 2 (s2)
4: iconst_0 // push the int value 0 onto the stack
5: istore 4 // store int value into variable 4 (j)
7: lload_2 // load a long value from a local variable 2 (i)
8: lconst_1 // push the long 1 onto the stack
9: ladd // add two longs
10: lstore_2 // store a long value in a local variable 2 (i)
11: iinc 4, 1 // increment local variable 4 (j) by signed byte 1
14: iload 4 // load an int value from a local variable 4 (j)
16: iload_0 // load an int value from a local variable 0 (L)
17: if_icmplt 7 // if value1 is less than value2, branch to instruction at 7
20: iinc 1, 1 // increment local variable 1 (i) by signed byte 1
23: iload_1 // load an int value from a local variable 1 (i)
24: iload_0 // load an int value from a local variable 0 (L)
25: if_icmplt 4 // if value1 is less than value2, branch to instruction at 4
28: lload_2 // load a long value from a local variable 2 (s2)
29: lreturn // return a long value
Now the bytecode of whileLooP:
0: iconst_0 // push int value 0 onto the stack
1: istore_1 // store int value into variable 1 (i)
2: lconst_0 // push the long 0 onto the stack
3: lstore_2 // store a long value in a local variable 2 (s3)
4: goto 26
7: iconst_0 // push the int value 0 onto the stack
8: istore 4 // store int value into variable 4 (j)
10: goto 17
13: lload_2 // load a long value from a local variable 2 (s3)
14: lconst_1 // push the long 1 onto the stack
15: ladd // add two longs
16: lstore_2 // store a long value in a local variable 2 (s3)
17: iload 4 // load an int value from a local variable 4 (j)
19: iinc 4, 1 // increment local variable 4 (j) by signed byte 1
22: iload_0 // load an int value from a local variable 0 (L)
23: if_icmplt 13 // if value1 is less than value2, branch to instruction at 13
26: iload_1 // load an int value from a local variable 1 (i)
27: iinc 1, 1 // increment local variable 1 by signed byte 1
30: iload_0 // load an int value from a local variable 0 (L)
31: if_icmplt 7 // if value1 is less than value2, branch to instruction at 7
34: lload_2 // load a long value from a local variable 2 (s3)
35: lreturn // return a long value
To make the output more readable I have append comments describing what each instruction does based on the Java bytecode instruction listings.
If you take a closer look you will see that there is a important difference between these two bytecodes.
The while loop (the same is true for the for loop) has the if statements (if_icmplt
instruction) defined at the end of the bytecode. Which means that to check the exit condition of the first loop a goto to line 26 has to be invoked and similarly a goto to line 17 for the second loop.
The above bytecode was generated using javac 1.6.0_45 on Mac OS X.
Summary
I think the different amount of comparisons plus the existence of goto instructions in the while and for loop bytecode is responsible for the performance difference between these loops.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With