Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell computationally intensive thread blocks all other threads

I want to write a program whose main thread forks a new thread for computation and waits on it to finish for a period of time. If the child thread does not finish in given time it is timed out and killed. I have the following code for this.

import Control.Concurrent

fibs :: Int -> Int 
fibs 0 = 0
fibs 1 = 1
fibs n = fibs (n-1) + fibs (n-2)

main = do 
    mvar  <- newEmptyMVar 
    tid   <- forkIO $ do
        threadDelay (1 * 1000 * 1000)
        putMVar mvar Nothing 
    tid'  <- forkIO $ do
        if fibs 1234 == 100
            then putStrLn "Incorrect answer" >> putMVar mvar (Just False)
            else putStrLn "Maybe correct answer" >> putMVar mvar (Just True)
    putStrLn "Waiting for result or timeout"
    result <- takeMVar mvar
    killThread tid
    killThread tid' 

I compiled the above program with ghc -O2 Test.hs and ghc -O2 -threaded Test.hs and ran it but in both cases the program just hangs without printing anything or exiting. If I add a threadDelay (2 * 1000 * 1000) to the computation thread before the if block then the program works as expected and finishes after a second as the timer thread is able to fill the mvar.

Why is threading not working as I expect?

like image 669
Random dude Avatar asked Jan 30 '20 14:01

Random dude


1 Answers

GHC uses a sort of hybrid of cooperative and preemptive multitasking in its concurrency implementation.

At the Haskell level, it seems preemptive because threads don't need to explicitly yield and can be seemingly interrupted by the runtime at any time. But at the runtime level, threads "yield" whenever they allocate memory. Since almost all Haskell threads are constantly allocating, this usually works pretty well.

However, if a particular calculation can be optimized into non-allocating code, it may become uncooperative at the runtime level and so un-preemptible at the Haskell level. As @Carl pointed out, it's actually the -fomit-yields flag, which is implied by -O2 that allows this to happen:

-fomit-yields

Tells GHC to omit heap checks when no allocation is being performed. While this improves binary sizes by about 5%, it also means that threads run in tight non-allocating loops will not get preempted in a timely fashion. If it is important to always be able to interrupt such threads, you should turn this optimization off. Consider also recompiling all libraries with this optimization turned off, if you need to guarantee interruptibility.

Obviously, in the single-threaded runtime (no -threaded flag), this means that one thread can completely starve out all other threads. Less obviously, the same thing can happen even if you compile with -threaded and use +RTS -N options. The problem is that an uncooperative thread can starve out the runtime scheduler itself. If at some point the uncooperative thread is the only thread currently scheduled to run, it will become uninterruptible, and the scheduler will never be re-run to consider scheduling additional threads, even if they could run on other O/S threads.

If you're just trying to test some stuff, change the signature of fib to fib :: Integer -> Integer. Since Integer causes allocation, everything will start working again (with or without -threaded).

If you run into this problem in real code, the easiest solution, by far, is the one suggested by @Carl: if you need to guarantee interruptability of threads, the code should be compiled with -fno-omit-yields, which keeps scheduler calls in non-allocating code. As per the documentation, this increases binary sizes; I assume it comes with a small performance penalty, too.

Alternatively, if the computation is already in IO, then explicitly yielding in the optimized loop may be a good approach. For a pure computation, you could convert it to IO and yield, though usually you can find a simple way to introduce an allocation again. In most realistic situations, there will be a way to introduce only a "few" yields or allocations -- enough to make the thread responsive again but not enough to seriously affect performance. (For example, if you have some nested recursive loops, yield or force an allocation in the outermost loop.)

like image 90
K. A. Buhr Avatar answered Nov 20 '22 01:11

K. A. Buhr