For a long time I’ve noticed that the Win64 version of my server application leak memory. While the Win32 version works fine with a relatively stable memory footprint, the memory used by the 64 bit version increases regularly – maybe 20Mb/day, without any apparent reason (Needless to say, FastMM4 did not report any memory leak for both of them). The source code is identical between the 32bit and the 64bit version. The application is built around the Indy TIdTCPServer component, it is a highly multithreaded server connected to a database that processes commands sent by other clients made with Delphi XE2.
I spend a lot of time reviewing my own code and trying to understand why the 64 bit version leaked so much memory. I ended up by using MS tools designed to track memory leaks like DebugDiag and XPerf and it seems there is a fundamental flaw in the Delphi 64bit RTL that causes some bytes to be leaked each time a thread has detached from a DLL. This issue is particularly critical for highly multithreaded applications that must run 24/7 without being restarted.
I reproduced the problem with a very basic project that is composed by an host application and a library, both built with XE2. The DLL is statically linked with the host app. The host app creates threads that just call the dummy exported procedure and exit:
Here is the source code of the library:
library FooBarDLL; uses Windows, System.SysUtils, System.Classes; {$R *.res} function FooBarProc(): Boolean; stdcall; begin Result := True; //Do nothing. end; exports FooBarProc;
The host application uses a timer to create a thread that just call the exported procedure:
TFooThread = class (TThread) protected procedure Execute; override; public constructor Create; end; ... function FooBarProc(): Boolean; stdcall; external 'FooBarDll.dll'; implementation {$R *.dfm} procedure THostAppForm.TimerTimer(Sender: TObject); begin with TFooThread.Create() do Start; end; { TFooThread } constructor TFooThread.Create; begin inherited Create(True); FreeOnTerminate := True; end; procedure TFooThread.Execute; begin /// Call the exported procedure. FooBarProc(); end;
Here is some screenshots that show the leak using VMMap (look at the red line named "Heap"). The following screenshots were taken within a 30 minutes interval.
The 32 bit binary shows an increase of 16 bytes, which is totally acceptable:
The 64 bit binary shows an increase of 12476 bytes (from 820K to 13296K), which is more problematic:
The constant increase of heap memory is also confirmed by XPerf:
XPerf usage
Using DebugDiag I was able to see the code path that was allocating the leaked memory:
LeakTrack+13529 <my dll>!Sysinit::AllocTlsBuffer+13 <my dll>!Sysinit::InitThreadTLS+2b <my dll>!Sysinit::::GetTls+22 <my dll>!System::AllocateRaiseFrame+e <my dll>!System::DelphiExceptionHandler+342 ntdll!RtlpExecuteHandlerForException+d ntdll!RtlDispatchException+45a ntdll!KiUserExceptionDispatch+2e KERNELBASE!RaiseException+39 <my dll>!System::::RaiseAtExcept+106 <my dll>!System::::RaiseExcept+1c <my dll>!System::ExitDll+3e <my dll>!System::::Halt0+54 <my dll>!System::::StartLib+123 <my dll>!Sysinit::::InitLib+92 <my dll>!Smart::initialization+38 ntdll!LdrShutdownThread+155 ntdll!RtlExitUserThread+38 <my application>!System::EndThread+20 <my application>!System::Classes::ThreadProc+9a <my application>!SystemThreadWrapper+36 kernel32!BaseThreadInitThunk+d ntdll!RtlUserThreadStart+1d
Remy Lebeau helped me on the Embarcadero forums to understand what was happening:
The second leak looks more like a definite bug. During thread shutdown, StartLib() is being called, which calls ExitThreadTLS() to free the calling thread's TLS memory block, then calls Halt0() to call ExitDll() to raise an exception that is caught by DelphiExceptionHandler() to call AllocateRaiseFrame(), which indirectly calls GetTls() and thus InitThreadTLS() when it accesses a threadvar variable named ExceptionObjectCount. That re-allocates the TLS memory block of the calling thread that is still in the process of being shut down. So either StartLib() should not be calling Halt0() during DLL_THREAD_DETACH, or DelphiExceptionHandler should not be calling AllocateRaiseFrame() when it detects a _TExitDllException being raised.
It seems clear for me that there is an major flaw in the Win64 way to handle threads shutdown. A such behavior prohibits the development of any multithreaded server application that must run 27/7 under Win64.
So:
QC Report 105559
Always join the joinable threads; by not joining them, you risk serious memory leaks. For example, a thread on Red Hat Enterprise Linux (RHEL4), needs a 10MB stack, which means at least 10MB is leaked if you haven't joined it. Say you design a manager-worker mode program to process incoming requests.
A memory leak reduces the performance of the computer by reducing the amount of available memory. Eventually, in the worst case, too much of the available memory may become allocated and all or part of the system or device stops working correctly, the application fails, or the system slows down vastly due to thrashing.
A very simple work around is to re-use the thread and not create and destroy them. Threads are pretty expensive, you'll probably get a perf boost too... Kudos on the debugging though...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With