Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ManualResetEvent.WaitOne() doesn't return if Reset() is called immediately after Set()

I have a problem in a production service which contains a "watchdog" timer used to check whether the main processing job has become frozen (this is related to a COM interop problem which unfortunately can't be reproduced in test).

Here's how it currently works:

  • During processing, the main thread resets a ManualResetEvent, processes a single item (this shouldn't take long), then sets the event. It then continues to process any remaining items.
  • Every 5 minutes, the watchdog calls WaitOne(TimeSpan.FromMinutes(5)) on this event. If the result is false, the service is restarted.
  • Sometimes, during normal operation, the service is being restarted by this watchdog even though processing takes nowhere near 5 minutes.

The cause appears to be that when multiple items await processing, the time between the Set() after the first item is processed, and the Reset() before the second item is processed is too brief, and WaitOne() doesn't appear to recognise that the event has been set.

My understanding of WaitOne() is that the blocked thread is guaranteed to receive a signal when Set() is called, but I assume I'm missing something important.

Note that if I allow a context switch by calling Thread.Sleep(0) after calling Set(), WaitOne() never fails.

Included below is a sample which produces the same behaviour as my production code. WaitOne() sometimes waits for 5 seconds and fails, even though Set() is being called every 800 milliseconds.

private static ManualResetEvent _handle;

private static void Main(string[] args)
{
    _handle = new ManualResetEvent(true);

    ((Action) PeriodicWait).BeginInvoke(null, null);
    ((Action) PeriodicSignal).BeginInvoke(null, null);

    Console.ReadLine();
}

private static void PeriodicWait()
{
    Stopwatch stopwatch = new Stopwatch();

    while (true)
    {
        stopwatch.Restart();
        bool result = _handle.WaitOne(5000, false);
        stopwatch.Stop();
        Console.WriteLine("After WaitOne: {0}. Waited for {1}ms", result ? "success" : "failure",
                            stopwatch.ElapsedMilliseconds);
        SpinWait.SpinUntil(() => false, 1000);
    }
}

private static void PeriodicSignal()
{
    while (true)
    {
        _handle.Reset();
        Console.WriteLine("After Reset");
        SpinWait.SpinUntil(() => false, 800);
        _handle.Set();
        // Uncommenting either of the lines below prevents the problem
        //Console.WriteLine("After Set");
        //Thread.Sleep(0);
    }
}

output of the above code


The Question

While I understand that calling Set() closely followed by Reset() doesn't guarantee that all blocked threads will resume, is it also not guaranteed that any waiting threads will be released?

like image 823
Simon MᶜKenzie Avatar asked Dec 03 '22 00:12

Simon MᶜKenzie


1 Answers

No, this is fundamentally broken code. There are only reasonable odds that the WaitOne() will complete when you keep the MRE set for such a short amount of time. Windows favors releasing a thread that's blocked on an event. But this will drastically fail when the thread isn't waiting. Or the scheduler picks another thread instead, one that runs with a higher priority and also got unblocked. Could be a kernel thread for example. MRE doesn't keep a "memory" of having been signaled and not yet waited on.

Neither Sleep(0) or Sleep(1) are good enough to guarantee that the wait is going to complete, there's no reasonable upper bound on how often the waiting thread could be bypassed by the scheduler. Although you probably ought to shut down the program when it takes longer than 10 seconds ;)

You'll need to do this differently. A simple way is to rely on the worker to eventually set the event. So reset it before you start waiting:

private static void PeriodicWait() {
    Stopwatch stopwatch = new Stopwatch();

    while (true) {
        stopwatch.Restart();
        _handle.Reset();
        bool result = _handle.WaitOne(5000);
        stopwatch.Stop();
        Console.WriteLine("After WaitOne: {0}. Waited for {1}ms", result ? "success" : "failure",
                            stopwatch.ElapsedMilliseconds);
    }
}

private static void PeriodicSignal() {
    while (true) {
        _handle.Set();
        Thread.Sleep(800);   // Simulate work
    }
}
like image 190
Hans Passant Avatar answered Mar 15 '23 06:03

Hans Passant