JMH - why do I need Blackhole.consumeCPU()

Tags:

blackhole

I'm trying to understand why it is wise to use Blackhole.consumeCPU().

Here is something I found about Blackhole.consumeCPU() via Google:

Sometimes when we run a benchmark across multiple threads we also want to burn some CPU cycles to simulate CPU busyness when running our code. This can't be a Thread.sleep, as we really want to burn CPU. Blackhole.consumeCPU(long) gives us the capability to do this.

My example code:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Level;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

@State(Scope.Thread)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class StringConcatAvgBenchmark {

StringBuilder stringBuilder1;
StringBuilder stringBuilder2;

StringBuffer stringBuffer1;
StringBuffer stringBuffer2;

String string1;
String string2;

/*
 * re-initializing the value after every iteration
 */
@Setup(Level.Iteration)
public void init() {
    stringBuilder1 = new StringBuilder("foo");
    stringBuilder2 = new StringBuilder("bar");

    stringBuffer1 = new StringBuffer("foo");
    stringBuffer2 = new StringBuffer("bar");

    string1 = new String("foo");
    string2 = new String("bar");

}

@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public StringBuilder stringBuilder() {
    // operation is very thin and so consuming some CPU
    Blackhole.consumeCPU(100);
    // to avoid dead code optimization returning the value
    return stringBuilder1.append(stringBuilder2);
}

@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public StringBuffer stringBuffer() {
    Blackhole.consumeCPU(100);      
    // to avoid dead code optimization returning the value
    return stringBuffer1.append(stringBuffer2);
}

@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public String stringPlus() {
    Blackhole.consumeCPU(100);      
    return string1 + string2;
}

@Benchmark
@Warmup(iterations = 10)
@Measurement(iterations = 100)
@BenchmarkMode(Mode.AverageTime)
public String stringConcat() {
    Blackhole.consumeCPU(100);      
    // to avoid dead code optimization returning the value
    return string1.concat(string2);
}

public static void main(String[] args) throws RunnerException {

    Options options = new OptionsBuilder()
            .include(StringConcatAvgBenchmark.class.getSimpleName())
            .threads(1).forks(1).shouldFailOnError(true).shouldDoGC(true)
            .jvmArgs("-server").build();
    new Runner(options).run();
}
}

Why are the results of this benchmark better with Blackhole.consumeCPU(100)?

EDIT:

Output with Blackhole.consumeCPU(100):

Benchmark                      Mode  Cnt    Score    Error  Units
StringBenchmark.stringBuffer   avgt   10  398,843 ± 38,666  ns/op
StringBenchmark.stringBuilder  avgt   10  387,543 ± 40,087  ns/op
StringBenchmark.stringConcat   avgt   10  410,256 ± 33,194  ns/op
StringBenchmark.stringPlus     avgt   10  386,472 ± 21,704  ns/op

Output without Blackhole.consumeCPU(100):

Benchmark                      Mode  Cnt   Score    Error  Units
StringBenchmark.stringBuffer   avgt   10  51,225 ± 19,254  ns/op
StringBenchmark.stringBuilder  avgt   10  49,548 ±  4,126  ns/op
StringBenchmark.stringConcat   avgt   10  50,373 ±  1,408  ns/op
StringBenchmark.stringPlus     avgt   10  87,942 ±  1,701  ns/op

My question was why the author of this code uses Blackhole.consumeCPU(100) here.

I think I know why now: the benchmarks are too quick without some delay.

With Blackhole.consumeCPU(100) you can measure each benchmark better and receive more significant results.

Is that right?

asked Mar 29 '16 15:03

1 Answer

Adding artificial delay would not normally improve the benchmark.

However, there are cases where the operation you are measuring contends for some resource, and you need a backoff that only consumes CPU and, hopefully, does nothing else. See, for example, the case in http://shipilev.net/blog/2014/nanotrusting-nanotime/
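A minimal plain-Java sketch of that kind of CPU-only backoff follows. It is not a JMH benchmark: the class BackoffSketch and the methods burn and run are hypothetical names, and burn is a hand-rolled stand-in for Blackhole.consumeCPU(tokens), so the sketch compiles without JMH on the classpath. In a real JMH benchmark you would call Blackhole.consumeCPU directly instead.

```java
import java.util.concurrent.atomic.AtomicLong;

public class BackoffSketch {
    // Volatile sink, seeded at runtime, so the JIT cannot fold the spin loop away.
    static volatile long sink = System.nanoTime();

    // Hypothetical stand-in for Blackhole.consumeCPU(tokens): burns CPU with
    // cheap arithmetic and never sleeps or blocks.
    static void burn(long tokens) {
        long t = sink;
        for (long i = tokens; i > 0; i--) {
            t += (t * 0x5DEECE66DL + 0xBL + i) & 0xFFFFFFFFFFFFL;
        }
        // Writes only in the rare case t == 42, so threads almost never
        // contend on the sink itself.
        if (t == 42) {
            sink += t;
        }
    }

    // Each worker performs the contended operation, then backs off by burning CPU.
    // Returns the final counter value.
    static long run(int threads, int opsPerThread, long backoffTokens) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int op = 0; op < opsPerThread; op++) {
                    counter.incrementAndGet(); // the contended operation
                    burn(backoffTokens);       // CPU-only backoff between attempts
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        // Same work, without and with backoff between the contended increments.
        System.out.println(run(4, 100_000, 0));
        System.out.println(run(4, 100_000, 50));
    }
}
```

Here the backoff changes how often threads collide on the shared counter, which is the thing being studied. That is different from padding a single-threaded operation with a fixed delay.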

The benchmark in the original question is not such a case, so I'd speculate that Blackhole.consumeCPU is used there without a good reason, or at least that the reason is not called out in the comments. Don't do that.
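As a rough illustration of why the artificial delay does not help, the scores posted in the question can be compared directly. The sketch below uses only those numbers (decimal commas written as points); the class DelayMasking and its helper methods are hypothetical names, and the interval check is a crude reading of the reported error margins, not a proper statistical test.

```java
import java.util.Locale;

public class DelayMasking {
    // How many times larger the bigger of two scores is than the smaller one.
    static double ratio(double a, double b) {
        return Math.max(a, b) / Math.min(a, b);
    }

    // True when [s1 - e1, s1 + e1] and [s2 - e2, s2 + e2] intersect.
    static boolean overlap(double s1, double e1, double s2, double e2) {
        return s1 - e1 <= s2 + e2 && s2 - e2 <= s1 + e1;
    }

    public static void main(String[] args) {
        // ns/op without consumeCPU(100), copied from the question.
        double plus = 87.942, plusErr = 1.701;
        double concat = 50.373, concatErr = 1.408;

        // ns/op with consumeCPU(100), copied from the question.
        double plusDelayed = 386.472, plusDelayedErr = 21.704;
        double concatDelayed = 410.256, concatDelayedErr = 33.194;

        System.out.printf(Locale.ROOT, "ratio without delay: %.2f, intervals overlap: %b%n",
                ratio(plus, concat), overlap(plus, plusErr, concat, concatErr));
        System.out.printf(Locale.ROOT, "ratio with delay:    %.2f, intervals overlap: %b%n",
                ratio(plusDelayed, concatDelayed),
                overlap(plusDelayed, plusDelayedErr, concatDelayed, concatDelayedErr));
    }
}
```

Without the delay, stringPlus and stringConcat are clearly separated. With it, every score grows by roughly 300 to 360 ns, the error margins grow too, and the two results can no longer be told apart.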

answered Sep 28 '22 13:09

Aleksey Shipilev
