 

Performance of C++/CLI function pointers versus .NET delegates

For my C++/CLI project I just tried to measure the cost of C++/CLI function pointers versus .NET delegates.

I expected C++/CLI function pointers to be faster than .NET delegates. So my test separately counts the number of invocations of the .NET delegate and of the native function pointer over 5 seconds each.

Results

Now the results were (and still are) shocking to me:

  • .NET delegate: 910M executions with result 152080413333030 in 5003ms
  • Function pointer: 347M executions with result 57893422166551 in 5013ms

That means that using the native C++/CLI function pointer is almost 3x slower than using a managed delegate from within C++/CLI code. How can that be? Should I use managed constructs when it comes to interfaces, delegates or abstract classes in performance-critical sections?

The test code

The function which gets called continuously:

__int64 DoIt(int n, __int64 sum)
{
    if ((n % 3) == 0)
        return sum + n;
    else
        return sum + 1;
}

The code that invokes the method tries to use all the parameters as well as the return value, so that nothing gets optimized away (hopefully). Here's the code (for .NET delegates):

__int64 executions;
__int64 result;
System::Diagnostics::Stopwatch^ w = gcnew System::Diagnostics::Stopwatch();

System::Func<int, __int64, __int64>^ managedPtr = gcnew System::Func<int, __int64, __int64>(&DoIt);
w->Restart();
executions = 0;
result = 0;
while (w->ElapsedMilliseconds < 5000)
{
    for (int i=0; i < 1000000; i++)
        result += managedPtr(i, executions);
    executions++;
}
System::Console::WriteLine(".NET delegate:       {0}M executions with result {2} in {1}ms", executions, w->ElapsedMilliseconds, result);

Similar to the .NET delegate invocation, the C++ function pointer is used:

typedef __int64 (* DoItMethod)(int n, __int64 sum);

DoItMethod nativePtr = DoIt;
w->Restart();
executions = 0;
result = 0;
while (w->ElapsedMilliseconds < 5000)
{
    for (int i=0; i < 1000000; i++)
        result += nativePtr(i, executions);
    executions++;
}
System::Console::WriteLine("Function pointer:    {0}M executions with result {2} in {1}ms", executions, w->ElapsedMilliseconds, result);

Additional information

  • Compiled with Visual Studio 2012
  • .NET Framework 4.5 was targeted
  • Release build (execution counts stay proportional for Debug builds)
  • Calling convention is __stdcall (__fastcall not allowed when the project gets compiled with CLR support)

All tests done:

  • .NET virtual method: 1025M executions with result 171358304166325 in 5004ms
  • .NET delegate: 910M executions with result 152080413333030 in 5003ms
  • Virtual method: 336M executions with result 56056335999888 in 5006ms
  • Function pointer: 347M executions with result 57893422166551 in 5013ms
  • Function call: 1459M executions with result 244230520832847 in 5001ms
  • Inlined function: 1385M executions with result 231791984166205 in 5000ms

The direct call to "DoIt" is represented here by "Function call", which seems to get inlined by the compiler, as there is no (significant) difference in execution counts compared to a call to the inlined function.

Calls to C++ virtual methods are as 'slow' as the function pointer. A virtual method of a managed class (ref class) is as fast as the .NET delegate.

Update: I dug a little deeper, and it seems that for the tests with unmanaged functions, the transition to native code happens each time the DoIt function gets called. Therefore I wrapped the inner loop in another function, which I forced to compile as unmanaged:

#pragma managed(push, off)
__int64 TestCall(__int64* executions)
{
    __int64 result = 0;
    for (int i=0; i < 1000000; i++)
        result += DoItNative(i, *executions);
    (*executions)++;
    return result;
}
#pragma managed(pop)

Additionally, I tested std::function like this:

#pragma managed(push, off)
__int64 TestStdFunc(__int64* executions)
{
    __int64 result = 0;
    std::function<__int64(int, __int64)> func(DoItNative);
    for (int i=0; i < 1000000; i++)
        result += func(i, *executions);
    (*executions)++;
    return result;
}
#pragma managed(pop)

Now, the new results are:

  • Function call: 2946M executions with result 495340439997054 in 5000ms
  • std::function: 160M executions with result 26679519999840 in 5018ms

std::function is a bit disappointing.
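Some gap is to be expected: std::function type-erases its target, so each call goes through an extra indirection that the compiler usually cannot inline, whereas a direct call in a tight loop can be inlined completely. A minimal sketch for reproducing the comparison outside of C++/CLI, in standard C++ with std::chrono. The names timeLoop and Timing are assumptions of mine, not part of the original test, and long long stands in for __int64.

```cpp
#include <chrono>
#include <functional>

// Same body as DoIt from the question; long long stands in for __int64.
long long DoItNative(int n, long long sum)
{
    return (n % 3) == 0 ? sum + n : sum + 1;
}

// Hypothetical result type: accumulated value plus elapsed wall time.
struct Timing
{
    long long result;
    double    milliseconds;
};

// Hypothetical harness: calls f `iterations` times and times the loop.
// Being a template, it accepts a plain function pointer or a std::function.
template <typename Callable>
Timing timeLoop(Callable f, int iterations, long long executions)
{
    const auto start = std::chrono::steady_clock::now();
    long long result = 0;
    for (int i = 0; i < iterations; i++)
        result += f(i, executions);
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::milli> elapsed = stop - start;
    return Timing{ result, elapsed.count() };
}
```

Calling timeLoop(&DoItNative, 1000000, 0) and timeLoop(std::function<long long(int, long long)>(DoItNative), 1000000, 0) should return the same result, so only the milliseconds field differs between the two. The actual timings depend on the compiler and its optimization settings.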

asked Nov 18 '12 18:11 by uebe




1 Answer

You are seeing the cost of "double thunking". The core problem with your DoIt() function is that it is being compiled as managed code. The delegate call is very fast because going from managed code to managed code through a delegate is uncomplicated. The function pointer is slow, however: the compiler automatically generates code that first switches from managed code to unmanaged code and makes the call through the function pointer. That call then ends up in a stub that switches from unmanaged code back to managed code and calls DoIt().

Presumably what you really meant to measure was a call to native code. Use a #pragma to force DoIt() to be generated as machine code, like this:

#pragma managed(push, off)
__int64 DoIt(int n, __int64 sum)
{
    if ((n % 3) == 0)
        return sum + n;
    else
        return sum + 1;
}
#pragma managed(pop)

You'll now see that the function pointer is faster than a delegate.
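If DoIt() has to stay managed, the thunks can also be avoided from the other side. Microsoft's documentation describes the __clrcall calling convention for this purpose: a function pointer declared __clrcall binds to the managed entry point, so the call never leaves managed code. A sketch, assuming compilation with /clr. The MANAGED_CALL macro, the DoItManagedMethod typedef and RunPass are names of mine; the macro expands to nothing on other compilers so that the snippet still builds as plain C++, and long long stands in for __int64.

```cpp
// __cplusplus_cli is predefined by MSVC when compiling with /clr.
#if defined(__cplusplus_cli)
#define MANAGED_CALL __clrcall
#else
#define MANAGED_CALL            // plain C++: ordinary function pointer
#endif

// Compiled as managed code under /clr, like DoIt in the question.
long long DoIt(int n, long long sum)
{
    if ((n % 3) == 0)
        return sum + n;
    else
        return sum + 1;
}

// Pointer type that binds to the managed entry point under /clr.
typedef long long (MANAGED_CALL * DoItManagedMethod)(int n, long long sum);

// Runs one pass of the inner loop from the question through the pointer.
long long RunPass(DoItManagedMethod ptr, long long executions)
{
    long long result = 0;
    for (int i = 0; i < 1000000; i++)
        result += ptr(i, executions);
    return result;
}
```

RunPass(&DoIt, executions) would then replace the inner loop of the function-pointer test. Whether this reaches the delegate's throughput would have to be measured.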

answered Oct 06 '22 00:10 by Hans Passant