 

C++ code runs faster when using Debug runtime library in Visual Studio 2013 [duplicate]

Why is the release-build memset slower than the debug build in Visual Studio 2012? I get the same result in Visual Studio 2010. My machine:

Intel Core i7-3770 3.40 GHz, 8 GB memory, OS: Windows 7 SP1 64-bit

This is my test code:

#include <boost/progress.hpp>
#include <cstdio>    // printf_s (MSVC)
#include <cstdlib>   // malloc
#include <cstring>   // memset

int main()
{
    const int Size = 1000*1024*1024;
    char* Data = (char*)malloc(Size);

#ifdef _DEBUG
    printf_s("debug\n");
#else
    printf_s("release\n");
#endif

    boost::progress_timer timer;  // prints elapsed seconds when destroyed
    memset(Data, 0, Size);

    return 0;
}

The output:

release
0.27 s

debug
0.06 s

Edit:

If I change the code to this, debug and release produce the same timing:

#include <boost/progress.hpp>
#include <cstdio>    // printf_s (MSVC)
#include <cstdlib>   // malloc
#include <cstring>   // memset

int main()
{
    const int Size = 1000*1024*1024;
    char* Data = (char*)malloc(Size);
    memset(Data, 1, Size);  // touch every page before timing

#ifdef _DEBUG
    printf_s("debug\n");
#else
    printf_s("release\n");
#endif

    {
        boost::progress_timer timer;
        memset(Data, 0, Size);
    }

    return 0;
}

So Hans Passant is right; thank you very much.

asked Jun 05 '14 by boo



1 Answer

This is a standard benchmark mistake: you don't measure the execution time of memset() at all. You actually measure the time the operating system needs to deal with the quarter of a million page faults your code generates, which depends heavily on what other processes are running and on how many pages the kernel's zero-page thread has already prepped.

On a demand-page virtual memory operating system like Windows, malloc() doesn't allocate memory at all. It allocates address space. Just numbers to the processor. The physical memory allocation doesn't happen until the processor accesses the address space. At which point the kernel is forced to provide the physical RAM to allow the processor to continue. Triggered by a soft page fault generated by the processor when it discovers that an address isn't mapped to RAM yet.

If you want an estimate of how long memset() really takes, you have to call it twice. The first call ensures that the RAM is mapped; time the second call to measure how long the memory writes take. That is a fixed number for large memory ranges like the one you are using: the memory cache and write-back buffers are ineffective, so speed is entirely determined by the bandwidth of the memory bus. Your debug result suggests DDR3 clocked at 266 MHz, pretty common.

This also removes the bias you get from the debug allocator in the debug build of the CRT, which fills allocated memory with a bit pattern that's likely to induce a crash when you try to access uninitialized memory. That fill touches every page inside malloc() itself, so the page-fault overhead was hidden from your measurement, since you didn't include the cost of malloc() in it.

answered Oct 22 '22 by Hans Passant