
Why do my CRT-free applications intermittently crash on startup?

For example, this application:

#define _WIN32_WINNT 0x0500

#include <windows.h>

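int __stdcall NoCRTMain(void) 
{
    int result;

    /* With UNICODE defined, GetCommandLine resolves to GetCommandLineW. */
    PWSTR lpCmdLine = GetCommandLine();

    /* Skip the program name, treating a quoted section as a single unit. */
    for (;;)
    {
        if (*lpCmdLine == L'"') 
        {
            lpCmdLine++;
            for (;;)
            {
                if (*lpCmdLine == L'"') break;
                if (*lpCmdLine == L'\0') break;
                lpCmdLine++;
            }
        }
        if (*lpCmdLine == L' ') break;
        if (*lpCmdLine == L'\0') break;
        lpCmdLine++;
    }

    /* Skip the spaces separating the program name from its arguments. */
    while (*lpCmdLine == L' ') lpCmdLine++;

    result = MessageBox(NULL, lpCmdLine, L"Scripting Engine", MB_OK | MB_SYSTEMMODAL);

    /* If MessageBox did not return IDOK (for example, because it failed),
       block forever instead of exiting. */
    if (result != IDOK) for (;;) Sleep(INFINITE);

    ExitProcess(0);
}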

Built in Visual Studio 2010 as a 64-bit application with no C runtime, this works perfectly - most of the time. Sometimes it will start crashing on startup for no readily apparent reason. This crash occurs before any of the code in the application is run (see below).

When the problem occurs, it affects only a particular instance of the executable, i.e., a particular executable file. A byte-for-byte identical copy of the file will run normally. The problem may (will?) disappear when the computer is rebooted. I am running tests to try to reproduce the issue reliably, so that I can identify the circumstances under which it occurs (only if Visual Studio is installed? only if anti-virus is installed?), but so far I haven't managed to reproduce the problem by any procedure other than dumb luck.

Most of the time, debugging shows that kernel32!BaseThreadInitThunk is calling an invalid address instead of the address of NoCRTMain, although some recent runs have failed earlier than that, apparently while loading DLLs.

I believe I have tracked the issue down to ImageBase being set incorrectly when the module is loaded. On a working instance, here is a memory dump taken at offset 0x00D8 from the start of the executable module, which is where the _IMAGE_OPTIONAL_HEADER64 structure (from winnt.h) begins:

00000001`3f5900d8 0b 02 0a 00 00 02 00 00 00 06 00 00 00 00 00 00 
00000001`3f5900e8 00 10 00 00 00 10 00 00 00 00 59 3f 01 00 00 00 

shows that ImageBase (the last eight bytes) contains the address of the start of the module, in this case 1`3f590000. On a failing instance, the same memory dump

00000001`3fc600d8 0b 02 0a 00 00 02 00 00 00 06 00 00 00 00 00 00 
00000001`3fc600e8 00 10 00 00 00 10 00 00 00 00 8f 3f 01 00 00 00  

shows that ImageBase, rather than being 1`3fc60000 as expected, is 1`3f8f0000.
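As a cross-check on that reading of the dumps, here is a minimal sketch in portable C that decodes ImageBase from the two 32-byte dumps above. It relies on ImageBase sitting at offset 0x18 within IMAGE_OPTIONAL_HEADER64 and being stored little-endian. The names read_le64, image_base_from_header and header_matches_load_address are hypothetical helpers made up for this illustration, not Windows APIs.

```c
#include <stdint.h>

/* Offset of ImageBase within IMAGE_OPTIONAL_HEADER64:
   Magic (2) + linker version (2) + five 4-byte fields (20) = 0x18. */
#define IMAGE_BASE_OFFSET 0x18

/* The first 32 bytes of the optional header, copied from the working dump. */
const uint8_t working_dump[32] = {
    0x0b, 0x02, 0x0a, 0x00, 0x00, 0x02, 0x00, 0x00,
    0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x10, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
    0x00, 0x00, 0x59, 0x3f, 0x01, 0x00, 0x00, 0x00
};

/* The same 32 bytes, copied from the failing dump. */
const uint8_t failing_dump[32] = {
    0x0b, 0x02, 0x0a, 0x00, 0x00, 0x02, 0x00, 0x00,
    0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x10, 0x00, 0x00, 0x00, 0x10, 0x00, 0x00,
    0x00, 0x00, 0x8f, 0x3f, 0x01, 0x00, 0x00, 0x00
};

/* Read a little-endian 64-bit value from a byte buffer. */
uint64_t read_le64(const uint8_t *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: extract ImageBase from the start of the optional header. */
uint64_t image_base_from_header(const uint8_t *opt_header)
{
    return read_le64(opt_header + IMAGE_BASE_OFFSET);
}

/* Hypothetical helper: returns 1 if the header's ImageBase equals the
   address at which the module is actually mapped, 0 otherwise. */
int header_matches_load_address(uint64_t module_base, const uint8_t *opt_header)
{
    return image_base_from_header(opt_header) == module_base;
}
```

Since each dump starts at module base + 0xD8, the module bases are 0x13f590000 (working) and 0x13fc60000 (failing). Calling header_matches_load_address with those bases and the corresponding dump agrees with the interpretation above: the working header matches its load address, while the failing header holds 0x13f8f0000 instead.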

This seems to occur before the earliest point at which a debugger can examine the process, so I'm not sure how to proceed. Perhaps I need to do kernel debugging? I currently have a VMware vSphere virtual machine exhibiting the problem, and I've got a snapshot I can revert to, so I can afford to experiment.

So:

  • does anyone know the cause of this behaviour, and more importantly, how to prevent it?

  • is my interpretation of the memory dump mistaken?

  • any debugging/troubleshooting suggestions?

Compiler options:

/Zi /nologo /W3 /WX- /O2 /Oi /GL /D "WIN32" /D "NDEBUG" /D "_WINDOWS"
/D "_UNICODE" /D "UNICODE" /Gm- /EHsc /MT /GS- /Gy /fp:precise /Zc:wchar_t
/Zc:forScope /Fp"x64\Release\sehalt.pch" /Fa"x64\Release\" /Fo"x64\Release\"
/Fd"x64\Release\vc100.pdb" /Gd /errorReport:queue 

Linker options:

/OUT:"C:\documents\code\w7lab-scripting\sehalt\x64\Release\sehalt.exe"
/INCREMENTAL:NO /NOLOGO "kernel32.lib" "user32.lib" "gdi32.lib"
"winspool.lib" "comdlg32.lib" "advapi32.lib" "shell32.lib" "ole32.lib"
"oleaut32.lib" "uuid.lib" "odbc32.lib" "odbccp32.lib" /NODEFAULTLIB
/MANIFEST /ManifestFile:"x64\Release\sehalt.exe.intermediate.manifest"
/ALLOWISOLATION /MANIFESTUAC:"level='asInvoker' uiAccess='false'" /DEBUG
/PDB:"C:\documents\code\w7lab-scripting\sehalt\x64\Release\sehalt.pdb"
/SUBSYSTEM:WINDOWS /OPT:REF /OPT:ICF 
/PGD:"C:\documents\code\w7lab-scripting\sehalt\x64\Release\sehalt.pgd"
/LTCG /TLBID:1 /ENTRY:"NoCRTMain" /DYNAMICBASE /NXCOMPAT
/MACHINE:X64 /ERRORREPORT:QUEUE 

PS: looking at which of my applications are known to have failed in this way and which aren't, I suspect that the problem only occurs for executables that are less than one page (4096 bytes) in size.

Harry Johnston asked Jun 05 '15 05:06


1 Answer

A year on, I'm finally confident in claiming that Hans' suggestion has worked perfectly: if the application is built with the /DYNAMICBASE:NO and /FIXED:YES options, the problem does not occur.
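
As a rough sketch of what that change looks like against the linker options listed in the question, only the last two lines need to differ. The spelling below follows this answer. Microsoft's linker documentation gives the option as /FIXED[:NO], so plain /FIXED may be the form your linker expects; treat the exact spelling as an assumption to verify.

```text
/LTCG /TLBID:1 /ENTRY:"NoCRTMain" /DYNAMICBASE:NO /FIXED:YES /NXCOMPAT
/MACHINE:X64 /ERRORREPORT:QUEUE
```

With these options the executable opts out of address space layout randomization and is marked as loadable only at its preferred base address, so the loader should never need to relocate it or rewrite ImageBase.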

Harry Johnston answered Oct 15 '22 04:10