I want to benchmark this simple C code:
float f(float x[], float y[]) {
    float p = 0;
    for (int i = 0; i < 64; i++)
        p += x[i] * y[i];
    return p;
}
My motivation is to try different compiler flags and also gcc and clang to see what difference it makes.
I found this test framework and have been trying to get it to work. Although I am completely new to C++, here is my best effort:
#include <benchmark.h>
#include <benchmark_api.h>
#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

float f(float* x, float* y) {
    float p = 0;
    for (int i = 0; i < 64; i++) {
        p += x[i] * y[i];
    }
    return p;
}

void f_benchmark(benchmark::State& state) {
    while (state.KeepRunning()) {
        benchmark::DoNotOptimize(f((float*) state.range(0), (float*) state.range(1)));
    }
}

void args(benchmark::internal::Benchmark* b) {
    std::uniform_real_distribution<float> rand(0, 100);
    for (int i = 0; i < 10; i++) {
        float* x = new float[64];
        float* y = new float[64];
        for (int i = 0; i < 64; i++) {
            x[i] = rand(gen);
            y[i] = rand(gen);
            printf("%f %f\n", x[i], y[i]);
        }
        b->Args({(int) x, (int) y});
    }
}

BENCHMARK(f_benchmark)->Apply(args);
BENCHMARK_MAIN();
To compile it I do:
g++ -Ofast -Wall -std=c++11 test.cpp -Ibenchmark/include/benchmark/ -Lbenchmark/src/ -o test -lbenchmark -lpthread
This gives me:
test.cpp: In function ‘void f_benchmark(benchmark::State&)’:
test.cpp:20:54: warning: cast to pointer from integer of different size [-Wint-to-pointer-cast]
benchmark::DoNotOptimize(f((float*) state.range(0), (float*) state.range(1)));
[...]
test.cpp: In function ‘void args(benchmark::internal::Benchmark*)’:
test.cpp:38:20: error: cast from ‘float*’ to ‘int’ loses precision [-fpermissive]
b->Args({(int) x, (int) y});
^
[...]
How can I get rid of those warnings and in general, am I doing this right?
Your code casts a float* to int and back to a float*. This can cause problems, because sizeof(int) and sizeof(float*) are not guaranteed to be identical (e.g. on x86-64, int is 32 bits while float* is 64 bits).
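To make the size mismatch concrete, here is a minimal sketch. The helper names are hypothetical and only serve as illustration. It checks whether int is narrower than a pointer on the current platform, and shows that std::uintptr_t (from <cstdint>, optional in the standard but available on mainstream platforms) is an integer type meant to hold a pointer value:

```cpp
#include <cstdint>

// Hypothetical helper: true when int is narrower than a pointer, which is
// the situation the compiler diagnostics in the question complain about.
constexpr bool int_is_narrower_than_pointer() {
    return sizeof(int) < sizeof(float*);
}

// Hypothetical helper: round-trips a pointer through std::uintptr_t and
// reports whether the original pointer value came back unchanged.
bool survives_uintptr_round_trip(float* p) {
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<float*>(bits) == p;
}
```

Note that this only illustrates why the cast to int is lossy. It is not a recommendation to pass pointers through benchmark arguments, since those are meant to be plain indices.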
The reason you run into this issue is probably that Args() supports only int arguments (they are meant to be used as indices for a family of benchmarks, not as actual arguments to your function). To use parameters of a different type, you could:
A. Use global variables to store the pre-calculated random arrays, e.g.:
#include <benchmark.h>
#include <benchmark_api.h>
#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

// Ten data sets of 64 floats each, selected by the benchmark argument.
float x[64*10], y[64*10];

float f(float* x, float* y) {
    float p = 0;
    for (int i = 0; i < 64; i++) {
        p += x[i] * y[i];
    }
    return p;
}

void f_benchmark(benchmark::State& state) {
    while (state.KeepRunning()) {
        // state.range(0) is the index of the data set, not a pointer.
        benchmark::DoNotOptimize(f(&x[state.range(0)*64], &y[state.range(0)*64]));
    }
}

void args(benchmark::internal::Benchmark* b) {
    std::uniform_real_distribution<float> rand(0, 100);
    for (int i = 0; i < 64*10; i++) {
        x[i] = rand(gen);
        y[i] = rand(gen);
    }
    // Register one benchmark run per data set.
    for (int i = 0; i < 10; ++i)
        b->Arg(i);
}

BENCHMARK(f_benchmark)->Apply(args);
BENCHMARK_MAIN();
B. Calculate the random numbers as part of the benchmark function. Choose this approach if you really require different random values for each iteration. The timing needs to be paused and resumed accordingly, so that the time spent on random generation and memory allocation is not included in the measurement, e.g.:
#include <benchmark.h>
#include <benchmark_api.h>
#include <cstdio>
#include <random>

std::random_device seed;
std::mt19937 gen(seed());

float f(float* x, float* y) {
    float p = 0;
    for (int i = 0; i < 64; i++) {
        p += x[i] * y[i];
    }
    return p;
}

void f_benchmark(benchmark::State& state) {
    // Setup done before the KeepRunning() loop is not timed.
    std::uniform_real_distribution<float> rand(0, 100);
    float* x = new float[64];
    float* y = new float[64];
    while (state.KeepRunning()) {
        // Pause only inside the loop, while the timer is running, so the
        // random number generation is excluded from the measurement.
        state.PauseTiming();
        for (int i = 0; i < 64; i++) {
            x[i] = rand(gen);
            y[i] = rand(gen);
        }
        state.ResumeTiming();
        benchmark::DoNotOptimize(f(x, y));
    }
    delete[] x;
    delete[] y;
}

BENCHMARK(f_benchmark)->Apply([](benchmark::internal::Benchmark* b){
    for (int i = 0; i < 10; ++i)
        b->Arg(i);
});
BENCHMARK_MAIN();
Side note: Also take care of the memory leak in your for loop. You should call delete[] once for every new[].
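One way to avoid matching every new[] with a delete[] by hand is to let std::vector own the storage. The following is only a sketch under that assumption; make_random_vector is a hypothetical helper name, and f takes const pointers here purely for illustration:

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: returns n random floats drawn from the same
// distribution as in the question. The vector frees its memory
// automatically when it goes out of scope, so no delete[] is needed.
std::vector<float> make_random_vector(std::size_t n, std::mt19937& gen) {
    std::uniform_real_distribution<float> dist(0.0f, 100.0f);
    std::vector<float> values(n);
    for (float& value : values) {
        value = dist(gen);
    }
    return values;
}

// Same dot product as in the question, over 64 elements.
float f(const float* x, const float* y) {
    float p = 0;
    for (int i = 0; i < 64; i++) {
        p += x[i] * y[i];
    }
    return p;
}
```

The vectors can then be passed to f via data(), for example f(x.data(), y.data()), as long as each holds at least 64 elements.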