Destruction of Native Objects with Static Storage Duration

2012-12-09 Summary:

  • In a normal mixed-mode application, destructors of global native C++ objects run as finalizers. It's not possible to change that behavior or the associated timeout.
  • A mixed-mode assembly DLL runs C++ constructors/destructors during DLL load/unload - exactly like a native DLL.
  • Hosting the CLR in a native executable using the COM interface allows both the destructors to behave as in a native DLL (the behavior I desire) and setting the timeout for finalizers (an added bonus).
  • As far as I can tell the above applies to at least Visual Studio 2008, 2010 and 2012. (Only tested with .NET 4)

The actual CLR hosting executable I plan on using is very similar to the one outlined in this question except for a few minor changes:

  • Setting OPR_FinalizerRun to some value (60 seconds currently, but subject to change) as suggested by Hans Passant.
  • Using the ATL COM smart pointer classes (these aren't available in the express editions of Visual Studio, so I omitted them from this post).
  • Loading CLRCreateInstance from mscoree.dll dynamically (to allow better error messages when no compatible CLR is installed); see the sketch after this list.
  • Passing the command line on from the host to the designated Main function in the assembly DLL.
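
Illustrative sketch only: this is one way the dynamic loading mentioned in the third bullet could look. The helper name CreateMetaHost and the error messages are my own; only the CLRCreateInstance entry point and its documented signature come from mscoree.dll/metahost.h.

// Sketch: resolve CLRCreateInstance at run time instead of linking mscoree.lib,
// so a missing or pre-.NET-4 runtime can be reported with a friendlier message.
#include <windows.h>
#include <metahost.h>
#include <iostream>

typedef HRESULT (STDAPICALLTYPE *CLRCreateInstanceFn)(REFCLSID, REFIID, LPVOID*);

HRESULT CreateMetaHost(ICLRMetaHost** ppMetaHost)
{
    HMODULE hMscoree = LoadLibraryW(L"mscoree.dll");
    if (!hMscoree) {
        std::wcerr << L"mscoree.dll not found - no CLR appears to be installed." << std::endl;
        return HRESULT_FROM_WIN32(GetLastError());
    }
    CLRCreateInstanceFn pfnCreate = reinterpret_cast<CLRCreateInstanceFn>(
        GetProcAddress(hMscoree, "CLRCreateInstance"));
    if (!pfnCreate) {
        // CLRCreateInstance is only exported by the .NET 4 shim.
        std::wcerr << L"CLRCreateInstance not found - .NET 4 or later is required." << std::endl;
        return HRESULT_FROM_WIN32(GetLastError());
    }
    return pfnCreate(CLSID_CLRMetaHost, IID_PPV_ARGS(ppMetaHost));
}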

Thanks to all who took the time to read the question and/or comment.


2012-12-02 Update at the bottom of the post.

I'm working on a mixed-mode C++/CLI application using Visual Studio 2012 with .NET 4 and was surprised to discover that the destructors for some of the native global objects weren't getting called. Investigating the issue, it turns out that they behave like managed objects, as explained in this post.

I was quite surprised by this behavior (I understand it for managed objects) and couldn't find it documented anywhere, neither in the C++/CLI standard nor in the description of destructors and finalizers.

Following the suggestion in a comment by Hans Passant, I compiled the program as an assembly DLL and hosted it in a small native executable, and that does give me the desired behavior (destructors are given ample time to finish and run in the same thread in which they were constructed)!

My questions:

  1. Can I get the same behavior in a standalone executable?
  2. If (1) isn't feasible, is it possible to configure the process timeout policy (i.e. basically calling ICLRPolicyManager->SetTimeout(OPR_ProcessExit, INFINITE)) for the executable? This would be an acceptable workaround.
  3. Where is this documented / how can I educate myself more on the topic? I'd rather not rely on behavior that's liable to change.

To reproduce, compile the files below as follows:

cl /EHa /MDd CLRHost.cpp
cl /EHa /MDd /c Native.cpp
cl /EHa /MDd /c /clr CLR.cpp
link /out:CLR.exe Native.obj CLR.obj 
link /out:CLR.dll /DLL Native.obj CLR.obj 

Unwanted behavior:

C:\Temp\clrhost>clr.exe
[1210] Global::Global()
[d10] Global::~Global()

C:\Temp\clrhost>

Running hosted:

C:\Temp\clrhost>CLRHost.exe clr.dll
[1298] Global::Global()
2a returned.
[1298] Global::~Global()
[1298] Global::~Global() - Done!

C:\Temp\clrhost>

Used files:

// CLR.cpp
public ref class T {
    static int M(System::String^ arg) { return 42; }
};
int main() {}

// Native.cpp
#include <windows.h>
#include <iostream>
#include <iomanip>
using namespace std;
struct Global {
    Global() {
        wcout << L"[" << hex << GetCurrentThreadId() << L"] Global::Global()" << endl;
    }
    ~Global() {
        wcout << L"[" << hex << GetCurrentThreadId() << L"] Global::~Global()" << endl;
        Sleep(3000);
        wcout << L"[" << hex << GetCurrentThreadId() << L"] Global::~Global() - Done!" << endl;
    }
} g;

// CLRHost.cpp
#include <windows.h>
#include <metahost.h>
#pragma comment(lib, "mscoree.lib")

#include <iostream>
#include <iomanip>
using namespace std;

int wmain(int argc, const wchar_t* argv[])
{
    HRESULT hr = S_OK;
    ICLRMetaHost* pMetaHost = 0;
    ICLRRuntimeInfo* pRuntimeInfo = 0;
    ICLRRuntimeHost* pRuntimeHost = 0;
    wchar_t version[MAX_PATH];
    DWORD versionSize = _countof(version);

    if (argc < 2) { 
        wcout << L"Usage: " << argv[0] << L" <assembly.dll>" << endl;
        return 0;
    }

    if (FAILED(hr = CLRCreateInstance(CLSID_CLRMetaHost, IID_PPV_ARGS(&pMetaHost)))) {
        goto out;
    }

    if (FAILED(hr = pMetaHost->GetVersionFromFile(argv[1], version, &versionSize))) {
        goto out;
    }

    if (FAILED(hr = pMetaHost->GetRuntime(version, IID_PPV_ARGS(&pRuntimeInfo)))) {
        goto out;
    }

    if (FAILED(hr = pRuntimeInfo->GetInterface(CLSID_CLRRuntimeHost, IID_PPV_ARGS(&pRuntimeHost)))) {
        goto out;
    }

    if (FAILED(hr = pRuntimeHost->Start())) {
        goto out;
    }

    DWORD dwRetVal = E_NOTIMPL;
    if (FAILED(hr = pRuntimeHost->ExecuteInDefaultAppDomain(argv[1], L"T", L"M", L"", &dwRetVal))) {
        wcerr << hex << hr << endl;
        goto out;
    }

    wcout << dwRetVal << " returned." << endl;

    if (FAILED(hr = pRuntimeHost->Stop())) {
        goto out;
    }

out:
    if (pRuntimeHost) pRuntimeHost->Release();
    if (pRuntimeInfo) pRuntimeInfo->Release();
    if (pMetaHost) pMetaHost->Release();

    return hr;
}

2012-12-02:
As far as I can tell the behavior seems to be as follows:

  • In a mixed-mode EXE file, global destructors are run as finalizers during DomainUnload regardless of whether they are placed in native code or CLR code. This is the case in Visual Studio 2008, 2010 and 2012.
  • In a mixed-mode DLL hosted by a native application, destructors for global native objects are run during DLL_PROCESS_DETACH, after the managed method has run and all other cleanup has occurred. They run in the same thread as the constructor and there is no timeout associated with them (the desired behavior). As expected, the timeout for destructors of global managed objects (non-ref classes placed in files compiled with /clr) can be controlled using ICLRPolicyManager->SetTimeout(OPR_ProcessExit, <timeout>) (see the sketch below).

Hazarding a guess, I think the reason global native constructors/destructors function "normally" (defined as behaving as I would expect) in the DLL scenario is to allow using LoadLibrary and GetProcAddress on native functions. I would thus expect that it is relatively safe to rely on it not changing in the foreseeable future, but would appreciate having some kind of confirmation/denial from official sources/documentation either way.
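
As an illustration of the SetTimeout(OPR_ProcessExit, ...) call mentioned in the second bullet above, here is a minimal sketch; it assumes the pRuntimeHost variable from CLRHost.cpp above and mirrors the pattern shown in Hans Passant's answer below.

ICLRControl* pControl = 0;
ICLRPolicyManager* pPolicy = 0;
// Typically configured before pRuntimeHost->Start().
if (SUCCEEDED(pRuntimeHost->GetCLRControl(&pControl)) &&
    SUCCEEDED(pControl->GetCLRManager(__uuidof(ICLRPolicyManager), (void**)&pPolicy))) {
    // INFINITE (or a finite millisecond value) bounds how long shutdown-time
    // destructors of managed globals may run.
    pPolicy->SetTimeout(OPR_ProcessExit, INFINITE);
    pPolicy->Release();
}
if (pControl) pControl->Release();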

Update 2:

Tested in Visual Studio 2012 (Express and Premium editions; I unfortunately don't have access to earlier versions on this machine). It should work the same way on the command line (building as outlined above), but here's how to reproduce from within the IDE.

Building CLRHost.exe:

  1. File -> New Project
  2. Visual C++ -> Win32 -> Win32 Console Application (Name the project "CLRHost")
  3. Application Settings -> Additional Options -> Empty project
  4. Press "Finish"
  5. Right click on Source Files in the solution explorer. Add -> New Item -> Visual C++ -> C++ File. Name it CLRHost.cpp and paste the content of CLRHost.cpp from the post.
  6. Project -> Properties. Configuration Properties -> C/C++ -> Code Generation -> Change "Enable C++ Exceptions" to "Yes with SEH Exceptions (/EHa)" and "Basic Runtime Checks" to "Default"
  7. Build.

Building CLR.DLL:

  1. File -> New Project
  2. Visual C++ -> CLR -> Class Library (Name the project "CLR")
  3. Delete all the autogenerated files
  4. Project -> Properties. Configuration Properties -> C/C++ -> Precompiled Headers -> Precompiled Header. Change to "Not Using Precompiled Headers".
  5. Right click on Source Files in the solution explorer. Add -> New Item -> Visual C++ -> C++ File. Name it CLR.cpp and paste the content of CLR.cpp from the post.
  6. Add a new C++ file named Native.cpp and paste the code from the post.
  7. Right click on "Native.cpp" in the solution explorer and select properties. Change C/C++ -> General -> Common Language RunTime Support to "No Common Language RunTime Support"
  8. Project -> Properties -> Debugging. Change "Command" to point to CLRhost.exe, "Command Arguments" to "$(TargetPath)" including the quotes, "Debugger Type" to "Mixed"
  9. Build and debug.

Placing a breakpoint in the destructor of Global gives the following stack trace:

>   clr.dll!Global::~Global()  Line 11  C++
    clr.dll!`dynamic atexit destructor for 'g''()  + 0xd bytes  C++
    clr.dll!_CRT_INIT(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 416   C
    clr.dll!__DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 522 + 0x11 bytes    C
    clr.dll!_DllMainCRTStartup(void * hDllHandle, unsigned long dwReason, void * lpreserved)  Line 472 + 0x11 bytes C
    mscoreei.dll!__CorDllMain@12()  + 0x136 bytes   
    mscoree.dll!_ShellShim__CorDllMain@12()  + 0xad bytes   
    ntdll.dll!_LdrpCallInitRoutine@16()  + 0x14 bytes   
    ntdll.dll!_LdrShutdownProcess@0()  + 0x141 bytes    
    ntdll.dll!_RtlExitUserProcess@4()  + 0x74 bytes 
    kernel32.dll!74e37a0d()     
    mscoreei.dll!RuntimeDesc::ShutdownAllActiveRuntimes()  + 0x10e bytes    
    mscoreei.dll!_CorExitProcess@4()  + 0x27 bytes  
    mscoree.dll!_ShellShim_CorExitProcess@4()  + 0x94 bytes 
    msvcr110d.dll!___crtCorExitProcess()  + 0x3a bytes  
    msvcr110d.dll!___crtExitProcess()  + 0xc bytes  
    msvcr110d.dll!__unlockexit()  + 0x27b bytes 
    msvcr110d.dll!_exit()  + 0x10 bytes 
    CLRHost.exe!__tmainCRTStartup()  Line 549   C
    CLRHost.exe!wmainCRTStartup()  Line 377 C
    kernel32.dll!@BaseThreadInitThunk@12()  + 0x12 bytes    
    ntdll.dll!___RtlUserThreadStart@8()  + 0x27 bytes   
    ntdll.dll!__RtlUserThreadStart@8()  + 0x1b bytes    

Running as a standalone executable I get a stack trace that is very similar to the one observed by Hans Passant (though it isn't using the managed version of the CRT):

>   clrexe.exe!Global::~Global()  Line 10   C++
    clrexe.exe!`dynamic atexit destructor for 'g''()  + 0xd bytes   C++
    msvcr110d.dll!__unlockexit()  + 0x1d3 bytes 
    msvcr110d.dll!__cexit()  + 0xe bytes    
    [Managed to Native Transition]  
    clrexe.exe!<CrtImplementationDetails>::LanguageSupport::_UninitializeDefaultDomain(void* cookie) Line 577   C++
    clrexe.exe!<CrtImplementationDetails>::LanguageSupport::UninitializeDefaultDomain() Line 594 + 0x8 bytes    C++
    clrexe.exe!<CrtImplementationDetails>::LanguageSupport::DomainUnload(System::Object^ source, System::EventArgs^ arguments) Line 628 C++
    clrexe.exe!<CrtImplementationDetails>::ModuleUninitializer::SingletonDomainUnload(System::Object^ source, System::EventArgs^ arguments) Line 273 + 0x6e bytes   C++
    kernel32.dll!@BaseThreadInitThunk@12()  + 0x12 bytes    
    ntdll.dll!___RtlUserThreadStart@8()  + 0x27 bytes   
    ntdll.dll!__RtlUserThreadStart@8()  + 0x1b bytes    
Asked by user786653, Nov 29 '12


2 Answers

Getting the easy questions out of the way first:

A good resource for CLR customization is Steven Pratschner's book "Customizing the Microsoft .NET Framework Common Language Runtime". Beware that it is outdated; the hosting interfaces have changed in .NET 4.0. MSDN doesn't say much about the topic, but the hosting interfaces are well documented.

You can make debugging simpler by changing a debugger setting: change the Debugger Type from "Auto" to "Managed" or "Mixed".

Do note that your 3000 msec sleep is just on the edge; you should test with 5000 msec. If the C++ class appears in code that's compiled with /clr in effect (even with #pragma unmanaged in effect), then you'll need to override the finalizer thread timeout. Tested on the .NET 3.5 SP1 CLR, the following code worked well to give the destructor sufficient time to run to completion:

ICLRControl* pControl;
if (FAILED(hr = pRuntimeHost->GetCLRControl(&pControl))) {
    goto out;
}
ICLRPolicyManager* pPolicy;
if (FAILED(hr = pControl->GetCLRManager(__uuidof(ICLRPolicyManager), (void**)&pPolicy))) {
    goto out;
}
hr = pPolicy->SetTimeout(OPR_FinalizerRun, 60000);
pPolicy->Release();
pControl->Release();

I picked a minute as a reasonable time; tweak as necessary. Note that the MSDN documentation has a bug: it doesn't list OPR_FinalizerRun as a permitted value, but it does in fact work properly. Setting the finalizer thread timeout also ensures that a managed finalizer won't time out when it indirectly destructs an unmanaged C++ class, a very common scenario.

One thing you'll see when you run this code with CLRHost compiled with /clr is that the call to GetCLRManager() will fail with a HOST_E_INVALIDOPERATION return code. The default CLR host that got loaded to execute your CLRHost.exe won't let you override the policy, so you are pretty much stuck with having a dedicated EXE to host the CLR.

When I tested this by having CLRHost load a mixed-mode assembly, the call stack looked like this when setting a breakpoint on the destructor:

CLRClient.dll!Global::~Global()  Line 24    C++
[Managed to Native Transition]  
CLRClient.dll!<Module>.?A0x789967ab.??__Fg@@YMXXZ() + 0x1b bytes    
CLRClient.dll!_exit_callback() Line 449 C++
CLRClient.dll!<CrtImplementationDetails>::LanguageSupport::_UninitializeDefaultDomain(void* cookie = <undefined value>) Line 753    C++
CLRClient.dll!<CrtImplementationDetails>::LanguageSupport::UninitializeDefaultDomain() Line 775 + 0x8 bytes C++
CLRClient.dll!<CrtImplementationDetails>::LanguageSupport::DomainUnload(System::Object^ source = 0x027e1274, System::EventArgs^ arguments = <undefined value>) Line 808 C++
msvcm90d.dll!<CrtImplementationDetails>.ModuleUninitializer.SingletonDomainUnload(object source = {System.AppDomain}, System.EventArgs arguments = null) + 0xa1 bytes
    // Rest omitted

Do note that this is unlike the observations in your question. The code is triggered by the managed version of the CRT (msvcm90.dll), and it runs on a dedicated thread, started by the CLR to unload an appdomain. You can see the source code for this in the vc/crt/src/mstartup.cpp source file.


The second scenario occurs when the C++ class is part of a source code file that is compiled without /clr in effect and got linked into the mixed-mode assembly. The compiler then uses the normal atexit() handler to call the destructor, just like it normally does in an unmanaged executable. In this case the destructor runs when the DLL gets unloaded by Windows at program termination, as the managed version of the CRT shuts down.

Notably, this happens after the CLR has shut down and the destructor runs on the program's startup thread. Accordingly, the CLR timeouts are out of the picture and the destructor can take as long as it wants. The essence of the stack trace is now:

CLRClient.dll!Global::~Global()  Line 12    C++
CLRClient.dll!`dynamic atexit destructor for 'g''()  + 0xd bytes    C++
    // Confusingly named functions elided
    //...
CLRHost.exe!__crtExitProcess(int status=0x00000000)  Line 732   C
CLRHost.exe!doexit(int code=0x00000000, int quick=0x00000000, int retcaller=0x00000000)  Line 644 + 0x9 bytes   C
CLRHost.exe!exit(int code=0x00000000)  Line 412 + 0xd bytes C
    // etc..

This is, however, a corner case that will only occur when the startup EXE is unmanaged. As soon as the EXE is managed, it will run destructors on AppDomain.Unload, even if they appear in code that was compiled without /clr, so you still have the timeout problem. Having an unmanaged EXE is not very unusual; this will happen, for example, when you load [ComVisible] managed code. But that doesn't sound like your scenario, so you are stuck with CLRHost.

Answered by Hans Passant


To answer the "Where is this documented / how can I educate myself more on the topic?" question: you can understand how this works (or at least how it used to work for .NET Framework 2.0) if you download and check out the Shared Source Common Language Infrastructure (aka SSCLI) from http://www.microsoft.com/en-us/download/details.aspx?id=4917.

Once you've extracted the files, you will find this in gcEE.cpp ("garbage collection execution engine"):

#define FINALIZER_TOTAL_WAIT 2000

which defines the famous default value of 2 seconds. In the same file you will also see this:

BOOL GCHeap::FinalizerThreadWatchDogHelper()
{
    // code removed for brevity ...
    DWORD totalWaitTimeout;
    totalWaitTimeout = GetEEPolicy()->GetTimeout(OPR_FinalizerRun);
    if (totalWaitTimeout == (DWORD)-1)
    {
        totalWaitTimeout = FINALIZER_TOTAL_WAIT;
    }

This tells you the execution engine will obey the OPR_FinalizerRun policy, if defined, which corresponds to the value in the EClrOperation enumeration. GetEEPolicy is defined in eePolicy.h and eePolicy.cpp.

Answered by Simon Mourier