Problems benchmarking simple code with googlebenchmark

Tags:

c++

I want to benchmark this simple C code:

float f(float x[], float y[]) {
  float p = 0;
  for (int i = 0; i <64; i++)
    p += x[i] * y[i];
  return p;
}

My motivation is to try different compiler flags and also gcc and clang to see what difference it makes.

I found this test framework and have been trying to get it to work. Although I am completely new to C++, here is my best effort:

#include <benchmark.h>
#include <benchmark_api.h>

#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

float f(float* x, float* y) {
  float p = 0;
  for (int i = 0; i <64; i++) {
    p += x[i] * y[i];
  }
  return p;
}

void f_benchmark(benchmark::State& state) {
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(f((float*) state.range(0), (float*) state.range(1)));
  }
}

void args(benchmark::internal::Benchmark* b) {
  std::uniform_real_distribution<float> rand(0, 100);

  for (int i = 0; i < 10; i++) {
    float* x = new float[64];
    float* y = new float[64];

    for (int i = 0; i < 64; i++) {
      x[i] = rand(gen);
      y[i] = rand(gen);

      printf("%f %f\n", x[i], y[i]);
    }

    b->Args({(int) x, (int) y});
  }
}

BENCHMARK(f_benchmark)->Apply(args);

BENCHMARK_MAIN();

To compile it I do:

g++ -Ofast -Wall -std=c++11 test.cpp -Ibenchmark/include/benchmark/ -Lbenchmark/src/ -o test -lbenchmark -lpthread

This gives me:

test.cpp: In function ‘void f_benchmark(benchmark::State&)’:
test.cpp:20:54: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
     benchmark::DoNotOptimize(f((float*) state.range(0), (float*) state.range(1)));
[...]                                                                            
test.cpp: In function ‘void args(benchmark::internal::Benchmark*)’:
test.cpp:38:20: error: cast from ‘float*’ to ‘int’ loses precision [-fpermissive]
     b->Args({(int) x, (int) y});
                    ^
[...]

How can I get rid of those warnings and in general, am I doing this right?

asked Feb 14 '17 20:02

graffe

1 Answer

Your code casts a float* to int and back to float*. This is not safe, because sizeof(int) and sizeof(float*) are not guaranteed to be equal (e.g. on x86-64, int is 32 bits wide while float* is 64 bits wide), so the cast to int discards the upper half of the address. You probably ran into this because Args() accepts only int arguments: they are meant to be used as indices into a family of benchmarks, not as the actual arguments of the function under test. To use parameters of a different type, you could:

A. Use global variables to store the precomputed random arrays, e.g.:

#include <benchmark.h>
#include <benchmark_api.h>

#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

float x[64*10], y[64*10];

float f(float* x, float* y) {
  float p = 0;
  for (int i = 0; i <64; i++) {
    p += x[i] * y[i];
  }
  return p;
}

void f_benchmark(benchmark::State& state) {
  while (state.KeepRunning()) {
    benchmark::DoNotOptimize(f(&x[state.range(0)*64], &y[state.range(0)*64]));
  }
}

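void args(benchmark::internal::Benchmark* b) {
  // Fill the global arrays once, before any benchmark runs.
  std::uniform_real_distribution<float> rand(0, 100);
  for (int i = 0; i < 64*10; i++) {
    x[i] = rand(gen);
    y[i] = rand(gen);
  }
  // Register one benchmark per block of 64 floats; state.range(0) is the block index.
  for (int i = 0; i < 10; ++i)
    b->Arg(i);
}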
BENCHMARK(f_benchmark)->Apply(args);
BENCHMARK_MAIN();

B. Generate the random numbers inside the benchmark function. Choose this approach only if you really need different random values for each iteration. The timing has to be paused and resumed around the generation so that the time spent on random number generation and memory allocation is not included in the measurement, e.g.:

#include <benchmark.h>
#include <benchmark_api.h>
#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

float f(float* x, float* y) {
  float p = 0;
  for (int i = 0; i <64; i++) {
    p += x[i] * y[i];
  }
  return p;
}

void f_benchmark(benchmark::State& state) {
  // Setup outside the KeepRunning() loop is not timed.
  std::uniform_real_distribution<float> rand(0, 100);
  float* x = new float[64];
  float* y = new float[64];
  while (state.KeepRunning()) {
    // Exclude the random number generation from the measurement.
    state.PauseTiming();
    for (int i = 0; i < 64; i++) {
      x[i] = rand(gen);
      y[i] = rand(gen);
    }
    state.ResumeTiming();
    benchmark::DoNotOptimize(f(x, y));
  }
  delete[] x;
  delete[] y;
}

// The argument is not used by f_benchmark; this only registers ten runs.
BENCHMARK(f_benchmark)->Apply([](benchmark::internal::Benchmark* b){
  for (int i = 0; i < 10; ++i)
    b->Arg(i);
});

BENCHMARK_MAIN();
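
The size mismatch described at the start of this answer can be checked directly. The following is a minimal sketch using only the standard library; the helper names round_trip_wide and int_holds_pointer are made up for illustration and are not part of Google Benchmark. It assumes the platform provides std::uintptr_t, which is optional in the standard but available on mainstream compilers.

```cpp
#include <cstdint>

// Hypothetical helper: round-trips a pointer through std::uintptr_t,
// an integer type that is wide enough to hold any object pointer.
float* round_trip_wide(float* p) {
  std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
  return reinterpret_cast<float*>(bits);
}

// Hypothetical helper: reports whether a plain int has at least as many
// bytes as a float* on the platform this is compiled for.
bool int_holds_pointer() {
  return sizeof(int) >= sizeof(float*);
}
```

On a typical x86-64 build int_holds_pointer() returns false, which is exactly what the compiler diagnostics in the question are complaining about. Even where a wide enough integer type is available, passing an index as in option A is the simpler choice.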

Side note: Also take care of the memory leak in your for loop. Every array allocated with new[] should be released with exactly one call to delete[].
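
One way to avoid the manual new[]/delete[] pairing altogether is to let std::vector own the buffers. This is only a sketch under that assumption, not something the original code or the benchmark library requires; make_random_vector is a hypothetical helper name, and f is the same dot product as in the question with const-qualified parameters.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns n random floats drawn from the range lo..hi.
// The vector frees its storage automatically when it goes out of scope.
std::vector<float> make_random_vector(std::size_t n, std::mt19937& gen,
                                      float lo = 0.0f, float hi = 100.0f) {
  std::uniform_real_distribution<float> dist(lo, hi);
  std::vector<float> v(n);
  for (float& value : v) value = dist(gen);
  return v;
}

// Same dot product over 64 elements as f in the question.
float f(const float* x, const float* y) {
  float p = 0;
  for (int i = 0; i < 64; i++) {
    p += x[i] * y[i];
  }
  return p;
}
```

Inside a benchmark function this would be used as f(x.data(), y.data()), where x and y are vectors created once before the timing loop.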

answered Sep 30 '22 02:09

Constantin
