I can't sleep! :)
I have a reasonably large project on Windows and ran into some heap corruption issues. I have read everything on SO about it, including this nice topic: How to debug heap corruption errors?, but nothing helped out of the box. Debug CRT and BoundsChecker detected heap corruptions, but the addresses were always different and the detection points were always far away from the actual memory overwrites. I stayed up until the middle of the night and crafted the following hack:
#include <windows.h>
#include <cstddef>

DWORD PageSize = 0;

inline void SetPageSize()
{
    if ( !PageSize )
    {
        SYSTEM_INFO sysInfo;
        GetSystemInfo(&sysInfo);
        PageSize = sysInfo.dwPageSize;
    }
}

void* operator new (size_t nSize)
{
    SetPageSize();
    // round up to the next page boundary
    // (adds a full extra page when nSize is already a multiple of PageSize)
    size_t Extra = nSize % PageSize;
    nSize = nSize + ( PageSize - Extra );
    return VirtualAlloc( 0, nSize, MEM_COMMIT, PAGE_READWRITE);
}

void operator delete (void* pPtr)
{
    MEMORY_BASIC_INFORMATION mbi;
    VirtualQuery(pPtr, &mbi, sizeof(mbi));
    // leave pages in reserved state, but free the physical memory
    VirtualFree(pPtr, 0, MEM_DECOMMIT);
    DWORD OldProtect;
    // protect the address space, so no one can access those pages
    VirtualProtect(pPtr, mbi.RegionSize, PAGE_NOACCESS, &OldProtect);
}
Some heap corruption errors became obvious and I was able to fix them. There were no more Debug CRT warnings on exit. However, I have some questions regarding this hack:
1. Can it produce any false positives?
2. Can it miss some of the heap corruptions? (even if we replace malloc/realloc/free?)
3. It fails to run on 32-bit with OUT_OF_MEMORY and works only on 64-bit. Am I right that we simply run out of virtual address space on 32-bit?
Check for heap corruption
Most memory corruption is actually due to heap corruption. Try using the Global Flags Utility (gflags.exe) or pageheap.exe. See /windows-hardware/drivers/debugger/gflags-and-pageheap.
To debug heap corruption, you must identify both the code that allocated the memory involved and the code that deleted, released, or overwrote it. If the symptom appears immediately, you can often diagnose the problem by examining code near where the error occurred.
Heap corruption occurs when a program damages the allocator's view of the heap. The outcome can be relatively benign and cause a memory leak (where some memory isn't returned to the heap and is inaccessible to the program afterward), or it may be fatal and cause a memory fault, usually within the allocator itself.
Can it produce any false positives?
So, this will only catch bugs of the class "use after free()". For that purpose, I think, it's reasonably good.
If you try to delete something that wasn't new'ed, that's a different type of bug. In delete you should first check whether the memory was indeed allocated. You shouldn't blindly free the memory and mark it as inaccessible. I'd avoid that and report (by, say, doing a debug break) any attempt to delete something that shouldn't be deleted because it was never new'ed.
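One way to make that check concrete is to keep a set of live pointers. The following is only a minimal sketch: `LiveBlockRegistry` and `MallocAllocator` are hypothetical names, not part of the original hack, and the registry is not thread-safe. It assumes the bookkeeping goes through malloc/free so that it never re-enters a replaced global operator new.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <new>
#include <unordered_set>

// Hypothetical allocator that goes straight to malloc/free, so the registry
// itself never calls a replaced global operator new.
template <class T>
struct MallocAllocator {
    using value_type = T;
    MallocAllocator() = default;
    template <class U>
    MallocAllocator(const MallocAllocator<U>&) noexcept {}
    T* allocate(std::size_t n) {
        if (void* p = std::malloc(n * sizeof(T))) return static_cast<T*>(p);
        throw std::bad_alloc();
    }
    void deallocate(T* p, std::size_t) noexcept { std::free(p); }
};
template <class T, class U>
bool operator==(const MallocAllocator<T>&, const MallocAllocator<U>&) { return true; }
template <class T, class U>
bool operator!=(const MallocAllocator<T>&, const MallocAllocator<U>&) { return false; }

// Hypothetical registry of pointers handed out by operator new.
class LiveBlockRegistry {
public:
    // Record a pointer that was just returned by the allocator.
    void add(const void* p) { live_.insert(p); }

    // Returns true if p was live and is now forgotten; false if p was never
    // registered or has already been removed (double delete).
    bool remove(const void* p) { return live_.erase(p) == 1; }

private:
    std::unordered_set<const void*, std::hash<const void*>,
                       std::equal_to<const void*>,
                       MallocAllocator<const void*>> live_;
};
```

In the hack above, operator new would call `add()` on the pointer it returns, and operator delete would call `remove()` first and, for example, trigger a debug break when it returns false instead of decommitting the pages.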
Can it miss some of the heap corruptions? (even if we replace malloc/realloc/free?)
Obviously, this won't catch all corruptions of heap data between new and the respective delete. It will only catch those attempted after delete.
E.g.:
MyObj* myObj = new MyObj(1, 2, 3);
// corruption of *myObj happens here and may go unnoticed
delete myObj;
It fails to run on 32-bit target with OUT_OF_MEMORY error, only on 64-bit. Am I right that we simply run out of the virtual address space on 32-bits?
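Typically you have about 2 GB of virtual address space available to a process on 32-bit Windows. With one 4 KB page per allocation, that's good for at most ~524288 new's like in the provided code. In practice the limit is lower still: VirtualAlloc places each new reservation on an allocation-granularity boundary (typically 64 KB, reported as dwAllocationGranularity in SYSTEM_INFO), and the delete above only decommits the pages, so their address range is never returned. With objects bigger than 4 KB, you'll be able to successfully allocate fewer instances than that. And then address space fragmentation will reduce that number further.
It's a perfectly expected outcome if you create many object instances during the life cycle of your program.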
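The arithmetic behind that estimate can be sketched as follows. This is a rough upper bound only, with hypothetical helper names (`roundUp`, `maxAllocations`); it assumes a 2 GiB user address space, that address space is never released, and it ignores everything else living in the process (code, stacks, DLLs).

```cpp
#include <cstdint>

// Round value up to the next multiple of `multiple` (multiple must be > 0).
constexpr std::uint64_t roundUp(std::uint64_t value, std::uint64_t multiple) {
    return (value + multiple - 1) / multiple * multiple;
}

// Upper bound on how many allocations fit when every request occupies its own
// slot of address space, rounded up to `reservationUnit` bytes.
constexpr std::uint64_t maxAllocations(std::uint64_t addressSpace,
                                       std::uint64_t requestSize,
                                       std::uint64_t reservationUnit) {
    // Treat a zero-byte request as one byte so it still consumes a slot.
    return addressSpace / roundUp(requestSize == 0 ? 1 : requestSize, reservationUnit);
}

constexpr std::uint64_t kTwoGiB = 2ull * 1024 * 1024 * 1024;

// One 4 KB page per allocation: the ~524288 figure.
constexpr std::uint64_t kPerPage = maxAllocations(kTwoGiB, 16, 4096);

// One 64 KB slot per allocation (assumed allocation granularity).
constexpr std::uint64_t kPerSlot = maxAllocations(kTwoGiB, 16, 65536);
```

Under these assumptions `kPerPage` is 524288 and `kPerSlot` is 32768, which is why a program that creates many small objects exhausts a 32-bit address space quickly with this scheme.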
This won't catch:
Ideally, you should write a well-known bit pattern before and after your allocated blocks, so that operator delete can check whether they were overwritten (indicating a buffer over- or under-run).
Currently this would be allowed silently in your scheme, and switching back to malloc etc. would allow it to silently damage the heap and show up as an error later on (e.g. when freeing the block after the over-run one).
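A minimal sketch of such guard patterns is shown below, built on malloc/free rather than VirtualAlloc so it stays portable. The names (`guard::allocate`, `guard::release`), the guard byte 0xFD, and the layout are illustrative assumptions, not the Debug CRT's actual implementation.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

namespace guard {

// Layout: [header: user size][front guard][user bytes][back guard].
// Header and guard sizes are multiples of the maximum fundamental alignment,
// so the user pointer keeps malloc's alignment guarantee.
constexpr std::size_t kHeaderSize = alignof(std::max_align_t);
constexpr std::size_t kGuardSize = alignof(std::max_align_t);
constexpr unsigned char kGuardByte = 0xFD;
static_assert(kHeaderSize >= sizeof(std::size_t), "header must hold the size");

// Allocate n user bytes surrounded by guard bytes; returns nullptr on failure.
inline void* allocate(std::size_t n) {
    auto* raw = static_cast<unsigned char*>(
        std::malloc(kHeaderSize + n + 2 * kGuardSize));
    if (!raw) return nullptr;
    std::memcpy(raw, &n, sizeof n);  // remember the user size
    unsigned char* front = raw + kHeaderSize;
    unsigned char* user = front + kGuardSize;
    std::memset(front, kGuardByte, kGuardSize);
    std::memset(user + n, kGuardByte, kGuardSize);
    return user;
}

// True if every byte of the guard region still holds the pattern.
inline bool guardIntact(const unsigned char* g) {
    for (std::size_t i = 0; i < kGuardSize; ++i)
        if (g[i] != kGuardByte) return false;
    return true;
}

// Free a block from allocate(); returns false if either guard was overwritten.
inline bool release(void* p) {
    auto* user = static_cast<unsigned char*>(p);
    unsigned char* front = user - kGuardSize;
    unsigned char* raw = front - kHeaderSize;
    std::size_t n = 0;
    std::memcpy(&n, raw, sizeof n);
    const bool ok = guardIntact(front) && guardIntact(user + n);
    std::free(raw);
    return ok;
}

}  // namespace guard
```

A false return from `release()` means something wrote just before or just after the block during its lifetime. Like the Debug CRT checks, this only reports the damage at free time, not at the moment of the overwrite.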
You can't catch everything, though: note for example that if the underlying problem is a (valid) pointer somewhere getting overwritten with garbage, you can't detect this until the damaged pointer is dereferenced.