The Thread.Join method has three overloads: Join(), Join(Int32) and Join(TimeSpan). For each of these three overloads, there is the following statement in the Microsoft doc:
If the thread has already terminated when Join is called, the method returns immediately.
While this statement makes sense for the Join() overload, it doesn't specify which value is returned for the Join(Int32) and Join(TimeSpan) ones, so I tested the Int32 overload in two different environments:
Note that the Linux/Docker implementation returns true (like the Windows one) if the thread is still running when Join is called and terminates before the timeout elapses. It only returns false if the thread has terminated before the call.
In my opinion Join should always return true whatever the platform, so what could explain this inconsistent behavior? Am I missing something or is it a .NET 5 bug?
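For reference, the baseline case the quoted documentation describes can be sketched as follows. This is only a minimal illustration, not the code from the fiddle: the class name JoinBaseline and the helper JoinAfterTermination are hypothetical, and it runs outside of any shutdown or unloading handler.

```csharp
using System;
using System.Threading;

public static class JoinBaseline
{
    // Hypothetical helper: runs a short-lived thread to completion, then calls
    // Join(Int32) on the already-terminated thread and returns what Join reports.
    public static bool JoinAfterTermination(int timeoutMs)
    {
        var worker = new Thread(() => Thread.Sleep(50));
        worker.Start();
        worker.Join();                 // blocks until the worker has terminated
        return worker.Join(timeoutMs); // the thread is already terminated here
    }

    public static void Main()
    {
        Console.WriteLine("Join(100) after termination returned: {0}",
            JoinAfterTermination(100));
    }
}
```

According to the documented return value ("true if the thread has terminated"), the helper should report true in this situation.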
UPDATE
As suggested by @txtechhelp, here is a .NET Fiddle with the exact code I'm testing.
If I run this code on Windows 10 (or in .NET Fiddle) I get the following result:
Starting..
Sleeping 1200..expect T1 end before join
In T1
Leaving T1
Join(100)..expect success
Join(100) success!
Done..
If I then run this code using mcr.microsoft.com/dotnet/runtime:5.0 on Docker Desktop (v. 3.1.0), I get the following result:
Starting..
Sleeping 1200..expect T1 end before join
In T1
Leaving T1
Join(100)..expect success
Join(100) failed
Done..
UPDATE 2
Actually, after further testing I realized that the test above only fails if I call Join while the Docker application is unloading (i.e. after receiving the AssemblyLoadContext.Default.Unloading event, which is the signal sent by Docker to announce that it is going to shut down the application).
So here is the exact test, which fails even on .NET Fiddle:
public class Program
{
    public static void Main()
    {
        System.Runtime.Loader.AssemblyLoadContext.Default.Unloading += (arg) => { OnStopSignalReceived("application unloading"); };
    }

    public static void T1()
    {
        System.Console.WriteLine("In T1");
        System.Threading.Thread.Sleep(1000);
        System.Console.WriteLine("Leaving T1");
    }

    private static void OnStopSignalReceived(string stopSignalSource)
    {
        System.Threading.Thread t1 = new System.Threading.Thread(T1);
        System.Console.WriteLine("Starting..");
        t1.Start();
        System.Console.WriteLine("Sleeping 1200..expect T1 end before join");
        System.Threading.Thread.Sleep(1200);
        System.Console.WriteLine("Join(100)..expect success");
        if (t1.Join(100))
        {
            System.Console.WriteLine("Join(100) success!");
        }
        else
        {
            System.Console.WriteLine("Join(100) failed");
        }
        t1.Join();
        System.Console.WriteLine("Done..");
    }
}
join() does not do anything to thread t. The only thing it does is wait for thread t to terminate.
Join is a synchronization method that blocks the calling thread (that is, the thread that calls the method) until the thread whose Join method is called has completed. Use this method to ensure that a thread has been terminated. The caller will block indefinitely if the thread does not terminate.
In C#, the Thread class provides the Join() method, which allows one thread to wait until another thread completes its execution. If t is a Thread object whose thread is currently executing, then t.Join() causes the current thread to pause its execution until the thread it joins completes its execution.
join() creates a happens-before relationship. However, if we do not invoke join() or use other synchronization mechanisms, we do not have any guarantee that changes in the other thread will be visible to the current thread, even if the other thread has completed.
This issue does appear to be a bug in the underlying .NET 5 code for the AssemblyLoadContext class, or possibly some undefined behavior not yet specified by the documentation; to wit, the documentation for the AssemblyLoadContext.Unloading event merely states:
Occurs when the AssemblyLoadContext is unloaded.
That is the only sentence and doesn't provide much context given the issue you're experiencing.
That being said, after some digging around, I wrote two versions of the code you supplied and found some interesting behavior involving the AssemblyLoadContext.Unloading event and threads.
This code reproduces the bug you mention:
buggy.cs
using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime.Loader;

public class Program
{
    static Thread t1 = new Thread(ThreadFn);
    static Stopwatch sw = new Stopwatch();

    public static void Main()
    {
        AssemblyLoadContext.Default.Unloading += ContextUnloading;
        sw.Start();
        Console.WriteLine("{0}ms: leaving main", sw.ElapsedMilliseconds);
    }

    public static void ThreadFn()
    {
        Console.WriteLine("{0}ms: in ThreadFn, sleeping 1s", sw.ElapsedMilliseconds);
        Thread.Sleep(1000);
        Console.WriteLine("{0}ms: leaving ThreadFn", sw.ElapsedMilliseconds);
    }

    private static void ContextUnloading(AssemblyLoadContext context)
    {
        Console.WriteLine("{0}ms: unloading '{1}', thread state '{2}'", sw.ElapsedMilliseconds, context, t1.ThreadState);
        // possible bug/UB with t1.Start() in this function
        Console.WriteLine("{0}ms: starting thread", sw.ElapsedMilliseconds);
        t1.Start();
        Console.WriteLine("{0}ms: calling Sleep(1200); expect thread in state '{1}' to end before join called", sw.ElapsedMilliseconds, t1.ThreadState);
        Thread.Sleep(1200);
        Console.WriteLine("{0}ms: calling Join(100) on thread in state '{1}'; expect 'succeeded!'", sw.ElapsedMilliseconds, t1.ThreadState);
        Console.WriteLine("Join(100) {0}", (t1.Join(100) ? "succeeded!" : "failed"));
        Console.WriteLine("{0}ms: done", sw.ElapsedMilliseconds);
    }
}
Run it in dotnetfiddle
Running that code gives me the following results:
0ms: leaving main
12ms: unloading '"Default" System.Runtime.Loader.DefaultAssemblyLoadContext #0', thread state 'Unstarted'
13ms: starting thread
14ms: calling Sleep(1200); expect thread in state 'Running' to end before join called
14ms: in ThreadFn, sleeping 1s
1014ms: leaving ThreadFn
1214ms: calling Join(100) on thread in state 'Stopped'; expect 'succeeded!'
Join(100) failed
1214ms: done
You'll notice that in this buggy version the thread state at each point, and the elapsed time, match the code presented. The bug happens when the Join(Int32) method is called: it returns false, even though the documentation states that the return value is a Boolean where the value is:
true if the thread has terminated; false if the thread has not terminated after the amount of time specified by the millisecondsTimeout parameter has elapsed.
The thread was Stopped, which, according to the ThreadState documentation, means that the thread either responded to an Abort call (which is not made in the above code), or that:
A thread is terminated.
The document Understanding System.Runtime.Loader.AssemblyLoadContext even notes:
Be aware of thread races. Loading can be triggered by multiple threads. The AssemblyLoadContext handles thread races by atomically adding assemblies to its cache. The race loser's instance is discarded. In your implementation logic, don't add extra logic that doesn't handle multiple threads properly.
Combine all of that, and we would assume calling Join(Int32) should give the expected result of true in the above code.
So, yes, it would appear to be a bug.
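To make the mismatch between the reported thread state and the Join(Int32) result easier to see in a single log line, a small diagnostic helper along these lines could be called from the handler. This is only a sketch: JoinDiagnostics and Describe are hypothetical names, not part of the code above, and Main merely exercises the helper outside of the Unloading handler.

```csharp
using System;
using System.Threading;

public static class JoinDiagnostics
{
    // Hypothetical helper: records ThreadState and IsAlive immediately before
    // calling Join(Int32), then reports all three values together.
    public static string Describe(Thread thread, int timeoutMs)
    {
        ThreadState stateBefore = thread.ThreadState;
        bool aliveBefore = thread.IsAlive;
        bool joined = thread.Join(timeoutMs);
        return string.Format("state={0}, isAlive={1}, join({2})={3}",
            stateBefore, aliveBefore, timeoutMs, joined);
    }

    public static void Main()
    {
        var worker = new Thread(() => Thread.Sleep(50));
        worker.Start();
        worker.Join(); // wait until the worker has terminated
        Console.WriteLine(Describe(worker, 100));
    }
}
```

Calling Describe(t1, 100) in place of the bare t1.Join(100) inside ContextUnloading would show whether IsAlive agrees with the Stopped state when Join reports a failure.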
However
If you move the thread start into the Main function, instead of the unload event handler, the AssemblyLoadContext.Unloading event on the Default context is not raised until the thread has finished, and calling Join(Int32) then, of course, returns the expected result.
It would make sense that the Unloading event would not be raised until after the thread has completed, since it could be considered a "part" of the current assembly context, but it does not explain why the bug in the code above still happens.
So while the Join(100) call does succeed as expected in the code below, it would appear that this is because the AssemblyLoadContext.Unloading event is not raised right after Main exits, as one might expect; instead it is raised after the thread has finished, which makes contextual sense but is not necessarily noted in any of the documentation.
The 'successful' code:
syncbug.cs
using System;
using System.Diagnostics;
using System.Threading;
using System.Runtime.Loader;

public class Program
{
    static Thread t1 = new Thread(ThreadFn);
    static Stopwatch sw = new Stopwatch();

    public static void Main()
    {
        AssemblyLoadContext.Default.Unloading += ContextUnloading;
        sw.Start();
        // Get expected result starting thread, but Unloading isn't called until AFTER the thread
        // finishes, which is not the expected result according to the .NET documentation
        Console.WriteLine("{0}ms: starting thread", sw.ElapsedMilliseconds);
        t1.Start();
        Console.WriteLine("{0}ms: leaving main", sw.ElapsedMilliseconds);
    }

    public static void ThreadFn()
    {
        Console.WriteLine("{0}ms: in ThreadFn, sleeping 1s", sw.ElapsedMilliseconds);
        Thread.Sleep(1000);
        Console.WriteLine("{0}ms: leaving ThreadFn", sw.ElapsedMilliseconds);
    }

    private static void ContextUnloading(AssemblyLoadContext context)
    {
        Console.WriteLine("{0}ms: unloading '{1}', thread state '{2}'", sw.ElapsedMilliseconds, context, t1.ThreadState);
        Console.WriteLine("{0}ms: calling Sleep(1200); expect thread in state '{1}' to end before join called", sw.ElapsedMilliseconds, t1.ThreadState);
        Thread.Sleep(1200);
        Console.WriteLine("{0}ms: calling Join(100) on thread in state '{1}'; expect 'succeeded!'", sw.ElapsedMilliseconds, t1.ThreadState);
        Console.WriteLine("Join(100) {0}", (t1.Join(100) ? "succeeded!" : "failed"));
        Console.WriteLine("{0}ms: done", sw.ElapsedMilliseconds);
    }
}
Run it in dotnetfiddle
Running that code gives me the following results:
0ms: starting thread
11ms: leaving main
12ms: in ThreadFn, sleeping 1s
1012ms: leaving ThreadFn
1013ms: unloading '"Default" System.Runtime.Loader.DefaultAssemblyLoadContext #0', thread state 'Stopped'
1014ms: calling Sleep(1200); expect thread in state 'Stopped' to end before join called
2214ms: calling Join(100) on thread in state 'Stopped'; expect 'succeeded!'
Join(100) succeeded!
2215ms: done
You'll note the Unloading event is not raised until after the thread has completed.
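One possible explanation for this ordering, which I have not confirmed against the runtime source, is that a thread created with new Thread(...) is a foreground thread by default and keeps the process alive after Main returns, so shutdown (and with it the Unloading event) only begins once the thread has ended. A variant like the following, where the worker is marked as a background thread, could be used to probe that assumption. BackgroundVariant and CreateWorker are hypothetical names, and no output is claimed here because the behavior during shutdown is exactly what is in question.

```csharp
using System;
using System.Runtime.Loader;
using System.Threading;

public static class BackgroundVariant
{
    // Hypothetical helper: builds the same kind of worker as ThreadFn above,
    // but lets the caller choose whether it is a background thread.
    public static Thread CreateWorker(bool background)
    {
        var worker = new Thread(() =>
        {
            Console.WriteLine("in worker, sleeping 1s");
            Thread.Sleep(1000);
            Console.WriteLine("leaving worker");
        });
        worker.IsBackground = background; // background threads do not keep the process alive
        return worker;
    }

    public static void Main()
    {
        Thread worker = CreateWorker(background: true);
        AssemblyLoadContext.Default.Unloading += context =>
            Console.WriteLine("unloading, worker state '{0}'", worker.ThreadState);
        worker.Start();
        Console.WriteLine("leaving main");
    }
}
```

If the assumption holds, the Unloading handler in this variant should run while the worker is still sleeping rather than after it has finished.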
As of the writing of this answer, there are 82 open bugs for the AssemblyLoadContext class and 346 closed. So it's possible your issue is already noted in some way, but a cursory search didn't turn up anything that could relate to it.
Since this seems to be a legit bug, and since you have more insight into your code and what's happening, I'd recommend going to their Issues page and filing a New issue.