I made a small test to check the performance of global function/functor/lambda as comparator parameters for std::sort
function. Functor and lambda give the same performance. I was surprised to see, that global function, which appears to be the simplest callback, is much slower.
#include <stdafx.h>
#include <windows.h>
#include <iostream>
#include <stdlib.h>
#include <time.h>
#include <vector>
#include <string>
#include <sstream>
#include <algorithm>
using namespace std;
const int vector_size = 100000;
bool CompareFunction(const string& s1, const string& s2)
{
return s1[0] < s2[0]; // I know that is crashes on empty string, but this is not the point here
}
struct CompareFunctor
{
bool operator() (const string& s1, const string& s2)
{
return s1[0] < s2[0];
}
} compareFunctor;
int main()
{
srand ((unsigned int)time(NULL));
vector<string> v(vector_size);
for(size_t i = 0; i < vector_size; ++i)
{
ostringstream s;
s << rand();
v[i] = s.str().c_str();
}
LARGE_INTEGER freq;
LARGE_INTEGER beginTime, endTime;
QueryPerformanceFrequency(&freq);
QueryPerformanceCounter(&beginTime);
// One of three following lines should be uncommented
sort(v.begin(), v.end(), CompareFunction);
// sort(v.begin(), v.end(), compareFunctor);
// sort(v.begin(), v.end(), [](const string& s1, const string& s2){return s1[0] < s2[0];});
QueryPerformanceCounter(&endTime);
float f = (endTime.QuadPart - beginTime.QuadPart) * 1000.0f/freq.QuadPart; // time in ms
cout << f << endl;
return 0;
}
A bit of Windows-specific code is used for precise execution time measurement. Environment: Windows 7, Visual C++ 2010. Of course, Release configuration with default optimizations turned on. Execution time:
Global function 2.6 - 3.6 ms (???)
Functor - 1.7 - 2.4 ms
Lambda - 1.7 - 2.4 ms
So, why the global function is slower? Some problem with VC++ compiler, or something else?
In general, std::sort is indeed faster than qsort because of a couple of these things: qsort operates on void* , which first requires a dereference, and second requires the size of the data type to perform the swaps.
The Functor is self-documenting by default; but Lambda's need to be stored in variables (to be self-documenting) inside more-complex algorithm definitions.
Lambda expressions can be implemented as functors. If no parameters and no explicit return type, then () can be omitted.
the lambda and functor versions are in lined effectively eliminating the pushing and popping of arguments for every compare.
Try using
inline bool CompareFunction(const string& s1, const string& s2)
{
return s1[0] < s2[0]; // I know that is crashes on empty string, but this is not the point here
}
and see if it makes a difference. Note that automatic inlining by compilers will vary a lot depending on the compiler, build version etc. I would be surprised that the compiler doesn't automatically inline your global function - unless you're actually compiling in debug mode - which you shouldn't be doing for a performance test case. To really test whether inlining is the issue, you should divide your test into two files and compile them separately
replace
bool CompareFunction(const string& s1, const string& s2){
return s1[0] < s2[0]; // I know that is crashes on empty string, but this is not the point here
}
with
bool CompareFunction(const string& s1, const string& s2);
and put the definition in a separate file - say compare.cpp
While you're at it, you could frustrate inlining for functors as well by using:
struct CompareFunctor
{
bool operator() (const string& s1, const string& s2);
} compareFunctor;
and putting in a separate file
bool CompareFunctor::operator() (const string& s1, const string& s2)
{
return s1[0] < s2[0];
}
Passing a global function is the most complex, not the simplest.
When you pass in a function you are in fact passing in a pointer to the function so the sort function can't easily inline the call to the function as it doesn't know at compile time what the pointer will point to. Sure, it may be able to figure out that the call through the function pointer calls the same function every time and inline it all, but that's difficult.
When you use a lambda or functor, the compiler knows exactly which function it needs to call when it is generating the code so it is very much likely to be able to inline it all.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With