In reference to this question, the answer specifies that the unsorted array takes more time because it fails branch prediction. But if we make a minor change to the program:
import java.util.Arrays;
import java.util.Random;

public class Main {
    public static void main(String[] args) {
        // Generate data
        int arraySize = 32768;
        int[] data = new int[arraySize];
        Random rnd = new Random(0);
        for (int c = 0; c < arraySize; ++c) {
            data[c] = rnd.nextInt() % 256;
        }

        // !!! With this, the next loop runs faster
        Arrays.sort(data);

        // Test
        long start = System.nanoTime();
        long sum = 0;
        for (int i = 0; i < 100000; ++i) {
            // Primary loop
            for (int c = 0; c < arraySize; ++c) {
                if (data[c] >= 128) {
                    sum = data[c];
                }
            }
        }
        System.out.println((System.nanoTime() - start) / 1000000000.0);
        System.out.println("sum = " + sum);
    }
}
Here I have replaced (from the original question)

if (data[c] >= 128)
    sum += data[c];

with

if (data[c] >= 128)
    sum = data[c];

The unsorted array now gives approximately the same result as the sorted one. I want to ask why branch prediction does not seem to matter in this case.
The dominant CPU designs today rely on dynamic branch prediction. This technique mostly avoids the frontend bubble problem by predicting the correct address of the next instruction even for branches that aren't fully decoded and executed yet.
If the CPU guessed wrong, it has to undo anything the speculatively executed instructions might have done, clear the pipeline, and begin fetching the instructions it was supposed to be executing. This results in the same delay as if it hadn't made a prediction at all.
On the SPEC'89 benchmarks, very large bimodal predictors saturate at 93.5% correct, once every branch maps to a unique counter. The predictor table is indexed with the instruction address bits, so that the processor can fetch a prediction for every instruction before the instruction is decoded.
This scheme was implemented in the MIPS R10000 processor and the results showed prediction accuracy of ~90%.
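To make the bimodal scheme concrete, here is a minimal Java sketch of such a predictor, assuming a power-of-two table of two-bit saturating counters indexed by the low bits of the branch address (the class and method names are only illustrative, not any real hardware's layout):

// Minimal sketch of a bimodal predictor: one two-bit saturating counter per
// table slot, indexed by the low bits of the branch address.
class BimodalPredictor {
    private final int[] counters;   // values 0..3; >= 2 means "predict taken"
    private final int mask;

    BimodalPredictor(int tableSize) {   // tableSize is assumed to be a power of two
        counters = new int[tableSize];
        mask = tableSize - 1;
    }

    boolean predict(long branchAddress) {
        return counters[(int) (branchAddress & mask)] >= 2;
    }

    void update(long branchAddress, boolean taken) {
        int idx = (int) (branchAddress & mask);
        if (taken) {
            counters[idx] = Math.min(3, counters[idx] + 1);
        } else {
            counters[idx] = Math.max(0, counters[idx] - 1);
        }
    }
}

A branch that almost always goes the same way keeps its counter saturated and is predicted correctly, while a data-dependent branch that flips irregularly, like data[c] >= 128 over unsorted data, is mispredicted roughly half the time.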
I have used JMH to analyze this. Here is my code:
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@OutputTimeUnit(TimeUnit.MICROSECONDS)
@BenchmarkMode(Mode.AverageTime)
@Warmup(iterations = 2, time = 1)
@Measurement(iterations = 3, time = 1)
@State(Scope.Thread)
@Fork(2)
public class Comparison {
    static final int SIZE = 1 << 15;
    final int[] data = new int[SIZE];

    @Setup
    public void setup() {
        int i = 1;
        for (int c = 0; c < SIZE; ++c) data[c] = (i *= 611953);
        for (int c = 0; c < SIZE; ++c) data[c] = data[c] >= 128 ? 128 : 127;
    }

    @GenerateMicroBenchmark   // JMH's pre-1.0 name for what is now @Benchmark
    public long sum() {
        long sum = 0;
        for (int c = 0; c < SIZE; ++c) if (data[c] >= 128) sum += data[c];
        return sum;
    }
}
Notice I don't use either sorting or random number generation; they are an unnecessary complication. With the formula used in the above code:
data[c] = (i*=611953);
I get 132 µs runtime. If I comment out the line involving
data[c] = data[c] >= 128? 128 : 127;
the time doesn't change at all. This eliminates all arithmetic considerations and focuses on branch prediction. If I use
data[c] = 127;
I get 13 µs, and if I use
data[c] = 128;
I get 16 µs. That's the "base case", showing the small difference between the two constant branching decisions, both of which the predictor handles perfectly.
My conclusion: this is definitely the effect of low-level branch prediction.
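As a quick sanity check, separate from the benchmark, one can count how often the condition holds for the multiplier formula, to confirm that the branch really has no trivially predictable outcome (the class name is hypothetical, just for illustration):

// Stand-alone check: how often does data[c] >= 128 hold for the generator
// used in the benchmark above?
public class TakenRatio {
    public static void main(String[] args) {
        final int size = 1 << 15;
        int i = 1;
        int taken = 0;
        for (int c = 0; c < size; ++c) {
            int v = (i *= 611953);      // same overflowing-multiply "generator" as above
            if (v >= 128) taken++;
        }
        System.out.println("taken " + taken + " of " + size + " iterations");
    }
}

A branch that is taken about half the time in an irregular pattern is the worst case for a predictor.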
Let's analyze your intervention now. If I use the formula as presented in my code above, but change
if (data[c] >= 128) sum += data[c];
to
if (data[c] >= 128) sum = data[c];
then the timing indeed drops from 132 µs to 27 µs.
This is my guess at explaining the drop: an optimizing trick the JIT compiler could apply is to reverse the direction of the loop. Now your code becomes

for (int c = SIZE - 1; c >= 0; --c) if (data[c] >= 128) { sum = data[c]; break; }

The loop has been short-circuited to the minimum number of iterations needed to reach the same outcome as the original loop.
I added this

data[SIZE - 1] = 128;

to the end of the setup() method, but it didn't change the timing. That would seem to invalidate the naïve version of the "loop reversal" conjecture.
cmovl

In analyzing the generated assembly I find this:

cmp   edx, 0x80
cmovl eax, ebx

cmovl is a conditional move instruction: it performs the effect of the assignment in the then branch, but without involving any jump, therefore eliminating any penalty associated with branch prediction failure. This is a good explanation of the actual effect.
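In Java terms, the conditional move corresponds to rewriting the assignment as a pure selection between two values, so the result depends on data rather than on control flow. The following is only a source-level sketch of that idea, not the JIT's actual transformation (the class and method names are made up for illustration):

class CmovSketch {
    // Source-level sketch of what the cmovl form computes: the branch disappears
    // and sum simply selects one of two values on every iteration.
    static long conditionalMoveStyle(int[] data) {
        long sum = 0;
        for (int c = 0; c < data.length; ++c) {
            // Same result as: if (data[c] >= 128) sum = data[c];
            sum = (data[c] >= 128) ? data[c] : sum;
        }
        return sum;
    }
}

This fits the measurements above: the plain assignment could be compiled to the branch-free cmovl form, whereas the sum += data[c] variant kept paying the cost of the unpredictable branch.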