
VS2012 compiler: strange memory deallocation issues

I'm having a strange problem with the VS2012 compiler that doesn't seem to show up in GCC. The deallocation process ends up taking minutes rather than seconds. Does anyone have any input on this? Step debugging shows a noticeable hang at calls to RtlpCollectFreeBlocks(). I have this problem in both debug and release mode. I'm running 32-bit Windows 7, but I have the same problem on 64-bit Windows 7.

#include "stdafx.h"
#include <iostream>
#include <stdint.h>
#include <cstdlib>

#define SIZE 500000

using namespace std;

typedef struct
{
    uint32_t* thing1;
}collection;

/*
 * VS2012 compiler used.
 * Scenarios: 
 *  1) Don't allocate thing1. Program runs poorly.
 *  2) Allocate thing1 but don't delete it. Program runs awesome.
 *  3) Allocate thing1 and delete it. Program runs poorly.
 * 
 * Debug or Release mode does not affect outcome. GCC's compiler is fine.
 */
int _tmain(int argc, _TCHAR* argv[])
{
    collection ** colArray = new collection*[SIZE];

    for(int i=0;i<SIZE;i++)
    {
        collection * mine = new collection;
        mine->thing1 = new uint32_t; // Allocating without freeing runs fine. Either A) don't allocate or B) allocate and delete to make it run slow.
        colArray[i] = mine;
    }

    cout<<"Done with assignment\n";

    for(int i=0;i<SIZE;i++)
    {
        delete(colArray[i]->thing1); // delete makes it run poorly.
        delete(colArray[i]);

        if(i > 0 && i%100000 == 0)
        {
            cout<<"100 thousand deleted\n";
        }
    }
    delete [] colArray;

    cout << "Done!\n";
    int x;
    cin>>x;
}
Sean asked Sep 11 '13 15:09



1 Answer

The performance hit you're seeing comes from the Windows debug heap functionality, and it's a little stealthy in how it enables itself, even in release builds.

I took the liberty of building a 64-bit debug image of a simpler program and found this call stack:

  • msvcr110d.dll!_CrtIsValidHeapPointer(const void * pUserData=0x0000000001a8b540)
  • msvcr110d.dll!_free_dbg_nolock(void * pUserData=0x0000000001a8b540, int nBlockUse=1)
  • msvcr110d.dll!_free_dbg(void * pUserData=0x0000000001a8b540, int nBlockUse=1)
  • msvcr110d.dll!operator delete(void * pUserData=0x0000000001a8b540)

Of particular interest to me was the body of msvcr110d.dll!_CrtIsValidHeapPointer, which turns out to be this:

if (!pUserData)
    return FALSE;

// Note: all this does is check for null
if (!_CrtIsValidPointer(pHdr(pUserData), sizeof(_CrtMemBlockHeader), FALSE))
    return FALSE;

// but this is e-x-p-e-n-s-i-v-e
return HeapValidate( _crtheap, 0, pHdr(pUserData) );

That HeapValidate() call is brutal.

OK, I might expect this in a debug build, but certainly not in release. As it turns out, release fares better, but look at the call stack:

  • ntdll.dll!RtlDebugFreeHeap()
  • ntdll.dll!string "Enabling heap debug options\n"()
  • ntdll.dll!RtlFreeHeap()
  • kernel32.dll!HeapFree()
  • msvcr110.dll!free(void * pBlock)

This is interesting, because when I first ran the program on its own and then attached to the running process with the IDE (or WinDbg), without letting the debugger control the startup environment, the call stack stopped at ntdll.dll!RtlFreeHeap(). In other words, when the process is started outside the IDE, RtlDebugFreeHeap is not invoked. But why?

I thought to myself that the debugger must somehow be flipping a switch to enable heap debugging. After some digging I found that the "switch" is the debugger itself. Windows uses the special debug heap functions (RtlDebugAllocHeap and RtlDebugFreeHeap) if the process being run is spawned by a debugger. This MSDN page on WinDbg alludes to this, along with other interesting tidbits about debugging under Windows:

From "Debugging a User-Mode Process Using WinDbg":

Processes that the debugger creates (also known as spawned processes) behave slightly differently than processes that the debugger does not create.

Instead of using the standard heap API, processes that the debugger creates use a special debug heap. You can force a spawned process to use the standard heap instead of the debug heap by using the _NO_DEBUG_HEAP environment variable or the -hd command-line option.

Now we're getting somewhere. To test this, I simply dropped in a sleep() long enough for me to attach the debugger to the running process rather than spawn the process with it, then let it run on its merry way. Sure enough, as mentioned previously, it sailed full speed ahead.
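A minimal sketch of that kind of test, assuming standard C++11 rather than the exact code I ran. The helper names wait_for_attach and time_frees_ms are made up for illustration; the first pauses so a debugger can be attached to the running process, and the second times a batch of small frees so the spawned and attached cases can be compared.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <thread>
#include <vector>

// Hypothetical helper: pause so a debugger can be attached to the
// already-running process instead of spawning it.
void wait_for_attach(std::chrono::seconds delay)
{
    if (delay.count() > 0)
    {
        std::cout << "Attach the debugger now...\n";
        std::this_thread::sleep_for(delay);
    }
}

// Hypothetical helper: allocate `count` small blocks, then return how long
// it takes to free them all, in milliseconds.
long long time_frees_ms(std::size_t count)
{
    std::vector<std::uint32_t*> blocks(count);
    for (auto& block : blocks)
        block = new std::uint32_t(0);

    const auto start = std::chrono::steady_clock::now();
    for (auto* block : blocks)
        delete block;
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Calling wait_for_attach(std::chrono::seconds(15)) at the top of main and then printing time_frees_ms(500000) gives one number for a process started from the debugger and another for a process the debugger attached to afterwards. The 15-second delay is an arbitrary choice.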

Based on the content of that article, I have taken the liberty of updating my Release-mode builds to define _NO_DEBUG_HEAP=1 in the execution environment settings of my project files. I'm obviously still interested in granular heap activity in debug builds, so those configurations stayed as-is. After doing this, my release builds ran substantially faster under VS2012 (and VS2010), and I invite you to try it as well.
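To confirm that the setting actually reaches the process, something like the following could be called at startup. This is only a sketch: the function name no_debug_heap_requested is hypothetical, and it checks nothing more than whether the environment variable is visible to the program, not which heap Windows actually chose.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns true when _NO_DEBUG_HEAP is present in the
// process environment and set to "1". It only inspects the environment; it
// does not query the heap itself.
bool no_debug_heap_requested()
{
    // Note: MSVC may warn that std::getenv is deprecated (C4996); it still works.
    const char* value = std::getenv("_NO_DEBUG_HEAP");
    return value != nullptr && std::strcmp(value, "1") == 0;
}
```

If this returns false when the release build is launched from the IDE, the variable probably did not end up in the environment of the configuration being run.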

WhozCraig answered Sep 23 '22 08:09
