What is microbenchmarking?

I've heard this term used, but I'm not entirely sure what it means, so:

  • What DOES it mean and what DOESN'T it mean?
  • What are some examples of what IS and ISN'T microbenchmarking?
  • What are the dangers of microbenchmarking and how do you avoid it?
    • (or is it a good thing?)
asked May 16 '10 by polygenelubricants

People also ask

What is Java Microbenchmarking?

JMH is a Java harness for building, running, and analysing nano/micro/milli/macro benchmarks written in Java and other languages targeting the JVM.
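
A minimal JMH benchmark looks something like the sketch below (the class, field, and method names are illustrative, not from the snippet above). JMH calls each @Benchmark method repeatedly, handles warmup, and consumes the return value so the JIT cannot discard the measured work:

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Minimal JMH benchmark sketch; names are illustrative.
@State(Scope.Thread)
public class MathBenchmark {

    // Input held in mutable @State so the JIT cannot constant-fold it away.
    double x = Math.PI;

    @Benchmark
    public double measureLog() {
        // Returning the result lets JMH consume it, preventing dead-code elimination.
        return Math.log(x);
    }
}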

What is the Microbenchmark package useful for?

The microbenchmark package is useful for running small sections of code to assess performance, as well as for comparing the speed of several functions that do the same thing.

How do you run JMH?

There are two ways to run a JMH benchmark: use Maven, or run it via a JMH Runner class directly. With Maven, package the benchmarks as a JAR and run it via org.openjdk.jmh.Main.
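
The Runner-class route looks roughly like this, assuming the hypothetical MathBenchmark class sketched earlier:

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class BenchmarkRunner {
    public static void main(String[] args) throws RunnerException {
        // Select benchmarks by regex; MathBenchmark is the hypothetical class above.
        Options opts = new OptionsBuilder()
                .include(MathBenchmark.class.getSimpleName())
                .forks(1)
                .build();
        new Runner(opts).run();
    }
}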

How do you benchmark a spring boot application?

The solution was easier than I thought. The important part is to start the Spring Boot application when the benchmark is initialized: define a class-level variable for the configuration context, assign it during the benchmark's setup, and make a call to the bean method inside the benchmark.
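
As a hedged sketch of that recipe (MyApplication, MyService, and doWork are placeholders for your own application classes):

import org.openjdk.jmh.annotations.*;
import org.springframework.boot.SpringApplication;
import org.springframework.context.ConfigurableApplicationContext;

// Sketch only: MyApplication and MyService stand in for your own classes.
@State(Scope.Benchmark)
public class SpringBootBenchmark {

    // Class-level variable holding the application context.
    private ConfigurableApplicationContext context;
    private MyService service;

    @Setup
    public void setUp() {
        // Start the Spring Boot application when the benchmark is initialized.
        context = SpringApplication.run(MyApplication.class);
        service = context.getBean(MyService.class);
    }

    @TearDown
    public void tearDown() {
        context.close();
    }

    @Benchmark
    public Object callBean() {
        // Call the bean method inside the benchmark.
        return service.doWork();
    }
}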


1 Answer

It means exactly what it says on the tin: it's measuring the performance of something "small", like a system call to the kernel of an operating system.
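
In code, a naive microbenchmark is usually nothing more than a timer wrapped around a tiny operation, repeated many times. A sketch (in Java, since the question is Java-flavored; not from the original answer):

public class NaiveMicrobenchmark {
    public static void main(String[] args) {
        final int iterations = 10_000_000;
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // The "small" operation under test.
            sum += System.currentTimeMillis();
        }
        long elapsed = System.nanoTime() - start;
        System.out.println("ns per call: " + (double) elapsed / iterations);
        // Printing sum keeps the loop from being optimized away entirely.
        System.out.println("(checksum: " + sum + ")");
    }
}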

The danger is that people may use whatever results they obtain from microbenchmarking to dictate optimizations. And as we all know:

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil." -- Donald Knuth

There are many factors that can skew the results of microbenchmarks. Compiler optimization is one of them. If the operation being measured takes so little time that whatever you use to measure it takes longer than the operation itself, your microbenchmark will be skewed as well.

For example, someone might take a microbenchmark of the overhead of for loops:

#include <stdio.h>
#include <time.h>

void TestForLoop()
{
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        /* empty body: there is nothing here to measure */
    }

    clock_t elapsed = clock() - start;
    double elapsedPerIteration = (double)elapsed / CLOCKS_PER_SEC / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
}

Obviously compilers can see that the loop does absolutely nothing and not generate any code for the loop at all. So the value of elapsed and elapsedPerIteration is pretty much useless.

Even if the loop does something:

void TestForLoop()
{
    int sum = 0;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        ++sum;
    }

    clock_t elapsed = clock() - start;
    double elapsedPerIteration = (double)elapsed / CLOCKS_PER_SEC / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
}

The compiler may see that the variable sum isn't going to be used for anything and optimize it away, and optimize away the for loop as well. But wait! What if we do this:

void TestForLoop()
{
    int sum = 0;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        ++sum;
    }

    clock_t elapsed = clock() - start;
    double elapsedPerIteration = (double)elapsed / CLOCKS_PER_SEC / 1000000000;
    printf("Time elapsed for each iteration: %g seconds\n", elapsedPerIteration);
    printf("Sum: %d\n", sum); /* added so that sum appears to be used */
}

The compiler might be smart enough to realize that sum will always be a constant value, and optimize all that away as well. Many would be surprised at the optimizing capabilities of compilers these days.
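
This dead-code-elimination problem is exactly what benchmark harnesses try to address. In JMH, for example, you either return the computed value or sink it into a Blackhole so the JIT cannot prove it unused. A sketch, not from the original answer (and note that a JIT may still constant-fold a loop with fixed bounds):

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.infra.Blackhole;

public class SumBenchmark {
    @Benchmark
    public void sumLoop(Blackhole bh) {
        int sum = 0;
        for (int i = 0; i < 1000; ++i) {
            ++sum;
        }
        // Consuming sum prevents dead-code elimination, though a smart JIT
        // may still reduce this fixed-bound loop to a constant.
        bh.consume(sum);
    }
}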

But what about things that compilers can't optimize away?

void TestFileOpenPerformance()
{
    FILE* file = NULL;
    clock_t start = clock();

    for (int i = 0; i < 1000000000; ++i)
    {
        file = fopen("testfile.dat", "r"); /* fopen requires a mode argument */
        if (file)
            fclose(file);
    }

    clock_t elapsed = clock() - start;
    double elapsedPerIteration = (double)elapsed / CLOCKS_PER_SEC / 1000000000;
    printf("Time elapsed for each file open: %g seconds\n", elapsedPerIteration);
}

Even this is not a useful test! The operating system may see that the file is being opened very frequently, so it may preload it in memory to improve performance. Pretty much all operating systems do this. The same thing happens when you open applications - operating systems may figure out the top ~5 applications you open the most and preload the application code in memory when you boot up the computer!

In fact, there are countless variables that come into play: locality of reference (e.g. arrays vs. linked lists), effects of caches and memory bandwidth, compiler inlining, compiler implementation, compiler switches, number of processor cores, optimizations at the processor level, operating system schedulers, operating system background processes, etc.
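
To make the locality-of-reference point concrete, here is a hedged JMH sketch comparing a contiguous array against a pointer-chasing linked list (sizes and names are illustrative). On typical hardware the array traversal wins because of cache-friendly sequential access:

import java.util.LinkedList;
import java.util.List;
import org.openjdk.jmh.annotations.*;

// Illustrative sketch: contiguous array vs. pointer-chasing linked list.
@State(Scope.Thread)
public class LocalityBenchmark {

    int[] array;
    List<Integer> list;

    @Setup
    public void setUp() {
        array = new int[1_000_000];
        list = new LinkedList<>();
        for (int i = 0; i < array.length; i++) {
            array[i] = i;
            list.add(i);
        }
    }

    @Benchmark
    public long sumArray() {
        long sum = 0;
        for (int v : array) sum += v; // sequential, cache-friendly access
        return sum;
    }

    @Benchmark
    public long sumList() {
        long sum = 0;
        for (int v : list) sum += v; // each node is a separate heap object
        return sum;
    }
}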

So in many cases, microbenchmarking isn't a useful metric. It definitely does not replace whole-program benchmarks with well-defined test cases (profiling). Write readable code first, then profile to see what needs to be done, if anything.

I would like to emphasize that microbenchmarks are not evil per se, but one has to use them carefully (as is true for lots of other things related to computers).

answered Oct 09 '22 by 9 revs, 2 users 99%