How dangerous is forkProcess? How can I use it safely?

Tags:

fork

haskell

ghc

I’d like to play tricks with forkProcess, where I want to clone my Haskell process, and then let both clones talk to each other (maybe using Cloud Haskell to send even closures around).

But I wonder how well that works with the GHC runtime. Does anyone have experience here?

The documentation for forkProcess says that no other threads are copied, so I assume all data used by other threads will then be garbage collected in the fork, which sounds good. But that means that finalizers will run in both clones, which may or may not be the right thing to do…

I assume I can’t just use it without worry; but are there rules I can follow that will make sure its use is safe?

asked Nov 12 '20 17:11

Joachim Breitner

1 Answer

But that means that finalizers will run in both clones, which may or may not be the right thing to do…

Finalizers are very rarely used in Haskell, and even where they are used, I would expect them to only have in-process effects. For example, a finalizer calls hClose on garbage-collected Handles if you forgot to do it yourself. This is easy to demonstrate: the following program fails with openFile: resource exhausted (Too many open files), but if you uncomment the pure (), the Handles are no longer kept alive in rs, so they get garbage-collected and the program completes successfully.

import Control.Concurrent
import Control.Monad
import System.IO
import System.Mem

main :: IO ()
main = do
  rs <- replicateM 1000 $ do
    threadDelay 1000  -- not sure why this is needed; maybe to give control back
                      -- to the OS, so it can recycle the file descriptors?
    performGC
    openFile "input" ReadMode
    --pure ()
  print rs  -- force all the Handles to still be alive by this point

File descriptors are process-owned and are copied by forkProcess, so it makes sense to have each clone close their copies.
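As a rough illustration of that point, here is a minimal sketch in which the child closes its copy of a Handle and the parent keeps reading from its own copy. The helper name readAfterChildClose and the file contents are made up for this example, and it assumes a POSIX system with the unix package available.

```haskell
import System.IO
import System.Posix.Process (forkProcess, getProcessStatus)

-- Hypothetical helper for illustration; not part of any library.
readAfterChildClose :: FilePath -> IO String
readAfterChildClose path = do
  h <- openFile path ReadMode
  -- The child closes only its own copy of the file descriptor.
  pid <- forkProcess (hClose h)
  -- Block until the child has exited.
  _ <- getProcessStatus True False pid
  -- The parent's copy was not touched by the child, so it should still be usable.
  line <- hGetLine h
  hClose h
  pure line

main :: IO ()
main = do
  writeFile "input" "hello\n"
  readAfterChildClose "input" >>= putStrLn
```

The same reasoning applies when the close happens in a finalizer rather than in an explicit hClose: each process only ever closes the descriptor it owns.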

The problematic case would be a finalizer that cleans up a system-owned resource, e.g. by deleting a file. But I hope no library is relying on finalizers to delete such resources, because as the documentation explains, finalizers are not guaranteed to run. So it's better to use something like bracket to clean up resources (although the cleanup is still not guaranteed, e.g. if bracket is used in a thread other than the main thread, since such threads are simply terminated when the main thread exits).
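To make the bracket suggestion concrete, here is a small sketch that ties the deletion of a file to control flow instead of to garbage collection. The helper withScratchFile and the file name scratch.tmp are hypothetical, and the example assumes the directory package is available.

```haskell
import Control.Exception (bracket)
import System.Directory (doesFileExist, removeFile)

-- Hypothetical helper: create an empty scratch file, run the action on it,
-- and delete the file afterwards, even if the action throws an exception.
withScratchFile :: FilePath -> (FilePath -> IO a) -> IO a
withScratchFile path = bracket (writeFile path "" >> pure path) removeFile

main :: IO ()
main = do
  withScratchFile "scratch.tmp" $ \p -> appendFile p "some data"
  stillThere <- doesFileExist "scratch.tmp"
  putStrLn ("scratch file still exists: " ++ show stillThere)
```

Because the release action runs when the bracketed action returns or throws, the cleanup happens once, in the process and thread that acquired the resource, rather than whenever a garbage collector happens to notice the value is dead.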

What the documentation for forkProcess is warning about is not finalizers, but the fact that other threads will appear to end abruptly inside the forked process. This is especially problematic if those threads are holding locks. Normally, two threads can use modifyMVar_ to ensure that only one thread at a time is running a critical section, and as long as each thread only holds the lock for a finite amount of time, the other thread can simply wait for the MVar to become available. If you call forkProcess while one thread is in the middle of a modifyMVar_, however, that thread will not continue in the cloned process, so the cloned process cannot simply call modifyMVar_: it could get stuck forever waiting for a thread that no longer exists to release the lock. Here is a program demonstrating the problem.

import Control.Concurrent
import Control.Monad
import System.Posix.Process

-- >>> main
-- (69216,"forkIO thread",0)
-- (69216,"main thread",1)
-- (69216,"forkIO thread",2)
-- (69216,"main thread",3)
-- (69216,"forkIO thread",4)
-- (69216,"main thread",5)
-- calling forkProcess
-- forkProcess main thread waiting for MVar...
-- (69216,"forkIO thread",6)
-- (69216,"original main thread",7)
-- (69216,"forkIO thread",8)
-- (69216,"original main thread",9)
-- (69216,"forkIO thread",10)
-- (69216,"original main thread",11)
main :: IO ()
main = do
  mvar <- newMVar (0 :: Int)
  _ <- forkIO $ replicateM_ 6 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "forkIO thread", i)
      pure (i+1)
  threadDelay 50000
  replicateM_ 3 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "main thread", i)
      pure (i+1)
  putStrLn "calling forkProcess"
  _ <- forkProcess $ do
    threadDelay 25000
    replicateM_ 3 $ do
      putStrLn "forkProcess main thread waiting for MVar..."
      modifyMVar_ mvar $ \i -> do
        threadDelay 100000
        processID <- getProcessID
        print (processID, "forkProcess main thread", i)
        pure (i+1)
  replicateM_ 3 $ do
    modifyMVar_ mvar $ \i -> do
      threadDelay 100000
      processID <- getProcessID
      print (processID, "original main thread", i)
      pure (i+1)
  threadDelay 100000

As the output shows, the forkProcess main thread gets stuck waiting forever for the MVar, and never prints the forkProcess main thread line. If you move the threadDelays outside the modifyMVar_ critical section, the forkIO thread is a lot less likely to be in the middle of that critical section when forkProcess is called, so you'll see an output which looks like this instead:

(69369,"forkIO thread",0)
(69369,"main thread",1)
(69369,"forkIO thread",2)
(69369,"main thread",3)
(69369,"forkIO thread",4)
(69369,"main thread",5)
calling forkProcess
(69369,"forkIO thread",6)
(69369,"original main thread",7)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",6)
(69369,"forkIO thread",8)
(69369,"original main thread",9)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",7)
(69369,"forkIO thread",10)
(69369,"original main thread",11)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",8)

After the forkProcess call, there are now two MVars which both hold the value 5, and so in the original process, original main thread and forkIO thread are both incrementing one MVar, while in the other process forkProcess main thread is incrementing the other.
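One possible rule that follows from this, offered only as a sketch: take the lock yourself before forking, so that no other thread can be inside the critical section at the moment of the fork, and give the child a fresh MVar holding a snapshot of the value. The helper forkHoldingLock is hypothetical and not part of any library. It only protects the one MVar you pass in; locks hidden inside libraries (for example the ones guarding Handles) could still be held by other threads.

```haskell
import Control.Concurrent
import Control.Monad
import System.Posix.Process (forkProcess, getProcessStatus)
import System.Posix.Types (ProcessID)

-- Hypothetical helper: fork while holding the lock, and hand the child
-- its own MVar initialised with the value seen at fork time.
forkHoldingLock :: MVar a -> (MVar a -> IO ()) -> IO ProcessID
forkHoldingLock mvar child =
  withMVar mvar $ \x ->
    forkProcess $ do
      -- The copied mvar is empty in the child (we hold it), so use a new one.
      fresh <- newMVar x
      child fresh

main :: IO ()
main = do
  mvar <- newMVar (0 :: Int)
  -- A background thread that spends most of its time inside the critical section.
  _ <- forkIO $ forever $ modifyMVar_ mvar $ \i -> do
    threadDelay 100000
    pure (i + 1)
  threadDelay 50000
  pid <- forkHoldingLock mvar $ \fresh ->
    replicateM_ 3 $ modifyMVar_ fresh $ \i -> do
      print ("child", i)
      pure (i + 1)
  -- Wait for the child so its output appears before the parent exits.
  _ <- getProcessStatus True False pid
  pure ()
```

As in the output above, the child's increments are invisible to the parent: the two processes end up with independent MVars, so any real communication between the clones has to go through an inter-process channel such as a pipe or socket.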

answered Nov 15 '22 04:11

gelisam
