I have been chasing down what appears to be a memory leak in a DLL built in Delphi 2007 for Win32. The memory for the threadvar variables is not freed if the threads still exist when the DLL is unloaded (there are no active calls into the DLL when it is unloaded).
The question: Is there some way to cause Delphi to free memory associated with threadvar variables? It is not as simple as just not using them. A number of the existing Delphi components use them, so even if the DLL does not explicitly declare them, it ends up using them.
A Few Details
I have tracked it down to a LocalAlloc call that occurs in response to the usage of a threadvar variable, which is Delphi's "wrapper" around thread-local storage in Win32. For the curious, the allocation call is in the Delphi source file sysinit.pas. The corresponding LocalFree call occurs only for threads that get DLL_THREAD_DETACH calls. If you have multiple threads in an application and unload a DLL, there is no DLL_THREAD_DETACH call for each thread. The DLL gets a DLL_PROCESS_DETACH and nothing else; I believe that is expected and valid. Thus, any thread-local storage allocations made on other threads are leaked.
I re-created it with a short C program that starts several "worker" threads. It loads the DLL (via LoadLibrary) on the main thread and then makes calls into an exported function on the worker threads. The function exported from the Delphi DLL assigns a value to a threadvar integer variable and returns. The C program then unloads the DLL (via FreeLibrary on the main thread) and repeats. After about 32,000 iterations, the process memory usage shown in Process Explorer grows to over 130MB. I also verified it more accurately with umdh. UMDH showed 24 bytes lost per instance. But the 130MB in Process Explorer seems to indicate about 4K per iteration; I'm guessing a 4K segment was leaked each time based on that, but I don't know for sure.
For clarification, here is the threadvar declaration and the entire exported function:
threadvar
  threadint : integer;

function Startup( ulID: LongWord; hValue: Longint ): LongWord; stdcall;
begin
  threadint := 123;
  Result := 0;
end;
Thanks.
As you've already determined, thread-local storage will get released for each thread that gets detached from the DLL. That happens in System._StartLib when Reason is DLL_THREAD_DETACH. For that to happen, though, the thread needs to terminate. Thread-detach notifications occur when the thread terminates, not when the DLL is unloaded. (If it were the other way around, the OS would have to interrupt the thread someplace so it could insert a call to DllMain on the thread's behalf. That would be disastrous.)
The DLL is supposed to receive thread-detach notifications. In fact, that's the model suggested by Microsoft in its description of how to use thread-local storage with DLLs.
Thread-local storage has to be released from the context of the thread whose storage you want to free. From what I can tell, Delphi keeps all its threadvars in a single TLS index, given by the TlsIndex variable in SysInit.pas. You can use that value to call TlsFree whenever you want, but TlsFree releases the index for every thread in the process, so you'd better be sure there won't be any more code executed by the DLL on any thread.
Since you also want to free the memory used for holding all the threadvars, you'll need to call TlsGetValue to get the address of the buffer Delphi allocates for the current thread. Call LocalFree on that pointer before calling TlsFree.
This would be the (untested) Delphi code to free the thread-local storage.
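var
  TlsBuffer: Pointer;
begin
  // Fetch this thread's threadvar buffer from the RTL's TLS slot
  TlsBuffer := TlsGetValue(SysInit.TlsIndex);
  // Release the buffer that the RTL allocated with LocalAlloc
  LocalFree(HLocal(TlsBuffer));
  // Release the TLS index itself (this affects the whole process)
  TlsFree(SysInit.TlsIndex);
end;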
If you need to do this from the host application instead of from within the DLL, then you'll need to export a function that returns the DLL's TlsIndex value. That way, the host program can free the storage itself after the DLL is gone (thus guaranteeing no further DLL code executes in a given thread).
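A minimal sketch of such an export might look like the following. The library name and the function name GetDelphiTlsIndex are hypothetical, not something the RTL provides. It only hands the index to the host; whether the slot still holds a usable buffer pointer after the DLL's own process-detach cleanup is something you would have to verify for your Delphi version.

```delphi
library TlsIndexSample;

threadvar
  threadint : integer;

// Hypothetical export: returns the TLS index that the Delphi RTL
// allocated for this module's threadvars. SysInit.TlsIndex stays at -1
// if no index has been allocated.
function GetDelphiTlsIndex: Integer; stdcall;
begin
  Result := SysInit.TlsIndex;
end;

exports
  GetDelphiTlsIndex;

begin
end.
```

The host would call this once after LoadLibrary and keep the value, since the function is no longer callable once the DLL has been unloaded.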
Note that the Help clearly states that you are responsible for freeing your threadvars yourself.
You should do so as soon as you know you won't need them anymore.
From Help:
Dynamic variables that are ordinarily managed by the compiler (long strings, wide strings, dynamic arrays, variants, and interfaces) can be declared with threadvar, but the compiler does not automatically free the heap-allocated memory created by each thread of execution. If you use these data types in thread variables, it is your responsibility to dispose of their memory from within the thread, before the thread terminates. For example,
threadvar S: AnsiString;
S := 'ABCDEFGHIJKLMNOPQRSTUVWXYZ';
...
S := ''; // free the memory used by S
Note: Use of such constructs is discouraged.
You can free a variant by setting it to Unassigned and an interface or dynamic array by setting it to nil.
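As a rough sketch of that advice, the following console program releases one thread variable of each kind. The program name, variable names and the ReleaseThreadVars procedure are made up for illustration; the point is only that the cleanup runs on the thread that owns the values, before that thread terminates.

```delphi
program ThreadVarCleanup;

{$APPTYPE CONSOLE}

uses
  Variants;

threadvar
  CachedValue : Variant;
  CachedIntf  : IInterface;
  CachedItems : array of Integer;

// Call this from within the owning thread, before it terminates.
procedure ReleaseThreadVars;
begin
  CachedValue := Unassigned; // frees whatever the variant was holding
  CachedIntf  := nil;        // drops the interface reference
  CachedItems := nil;        // frees the dynamic array
end;

begin
  // Give the thread variables some heap-allocated content...
  CachedValue := 'some text';
  SetLength(CachedItems, 16);
  // ...and release it again on the same thread.
  ReleaseThreadVars;
end.
```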
At the risk of way too much code, here is a possible (poor) solution to my own question. Using the fact that the thread-local storage memory is stored in a single block for the threadvar variables (as pointed out by Mr. Kennedy - thanks), this code stores the allocated pointers in a TList and then frees them at process detach. I wrote it mostly just to see if it would work. I probably would not use this in production code because it makes assumptions about the Delphi runtime that could change with different versions and quite possibly misses problems even with the version I am using (Delphi 7 and 2007).
This implementation does make umdh happy; it no longer reports any memory leaks. However, if I run the test in a loop (load, call entrypoint on another thread, unload), the memory usage as seen in Process Explorer still grows alarmingly fast. In fact, I created a completely empty DLL with only an empty DllMain (that was not called since I did not assign Delphi's global DllProc pointer to it ... Delphi itself provides the real DllMain entrypoint). A simple loop of loading/unloading the DLL still leaked 4K per iteration. So there may still be something else a Delphi DLL is supposed to include (the main point of the original question). But I don't know what it is. A DLL written in C does not behave this way.
Our code (a server) can call DLLs written by customers to extend functionality. We typically unload the DLL after there are no more references to it. I think my solution to the problem is going to be to add an option to leave the DLL loaded "permanently" in memory. If customers use Delphi to write their DLL, they will need to turn that option on (or maybe we can detect that it is a Delphi DLL on load ... need to check that out). Nonetheless, it has been an interesting exercise.
library Sample;

uses
  SysUtils,
  Windows,
  Classes,
  HTTPApp,
  SyncObjs;

{$E dll}

var
  gListSync : TCriticalSection;
  gTLSList : TList;

threadvar
  threadint : integer;

// remove all entries from the TLS storage list
procedure RemoveAndFreeTLS();
var
  i : integer;
begin
  // Only call this at process detach. Those calls are serialized
  // so don't get the critical section.
  if assigned( gTLSList ) then
    for i := 0 to gTLSList.Count - 1 do
      // Is this actually safe in DllMain process detach? From reading the MSDN
      // docs, it appears that the only safe statement in DllMain is "return;"
      LocalFree( Cardinal( gTLSList.Items[i] ));
end;

// Remove this thread's entry
procedure RemoveThreadTLSEntry();
var
  p : pointer;
begin
  // Find the entry for this thread and remove it.
  gListSync.enter;
  try
    if ( SysInit.TlsIndex <> -1 ) and ( assigned( gTLSList )) then
    begin
      p := TlsGetValue( SysInit.TlsIndex );
      // if this thread didn't actually make a call into the DLL and use a threadvar
      // then there would be no memory for it
      if p <> nil then
        gTLSList.Remove( p );
    end;
  finally
    gListSync.leave;
  end;
end;

// Add current thread's TLS pointer to the global storage list if it is not already
// stored in it.
procedure AddThreadTLSEntry();
var
  p : pointer;
begin
  gListSync.enter;
  try
    // Need to create the list if first call
    if not assigned( gTLSList ) then
      gTLSList := TList.Create;
    if SysInit.TlsIndex <> -1 then
    begin
      p := TlsGetValue( SysInit.TlsIndex );
      if p <> nil then
      begin
        // if it is not stored, add it
        if gTLSList.IndexOf( p ) = -1 then
          gTLSList.Add( p );
      end;
    end;
  finally
    gListSync.leave;
  end;
end;

// Some entrypoint that uses threadvar (directly or indirectly)
function MyExportedFunc(): LongWord; stdcall;
begin
  threadint := 123;
  // Make sure this thread's TLS pointer is stored in our global list so
  // we can free it at process detach. Do this AFTER using the threadvar.
  // Delphi seems to allocate the memory on demand.
  AddThreadTLSEntry;
  Result := 0;
end;

procedure DllMain(reason: integer);
begin
  case reason of
    DLL_PROCESS_DETACH:
      begin
        // NOTE - if this is being called due to process termination, then it should
        // just return and do nothing. Very dangerous (and against MSDN recommendations)
        // otherwise. However, Delphi does not provide that information (the 3rd param of
        // the real DllMain entrypoint). In my test, though, I know this is only called
        // as a result of the DLL being unloaded via FreeLibrary
        RemoveAndFreeTLS();
        gListSync.Free;
        if assigned( gTLSList ) then
          gTLSList.Free;
      end;
    DLL_THREAD_DETACH:
      begin
        // on a thread detach, Delphi will clean up its own TLS, so we just
        // need to remove it from the list (otherwise we would get a double free
        // on process detach)
        RemoveThreadTLSEntry();
      end;
  end;
end;

exports
  DllMain,
  MyExportedFunc;

// Initialization
begin
  IsMultiThread := TRUE;
  // Make sure Delphi calls my DllMain
  DllProc := @DllMain;
  // sync object for managing TLS pointers. Is it safe to create a critical section?
  // This init code is effectively DllMain's DLL_PROCESS_ATTACH
  gListSync := TCriticalSection.Create;
end.