I’d like to play tricks with forkProcess
, where I want to clone my Haskell process, and then let both clones talk to each other (maybe using Cloud Haskell to send even closures around).
But I wonder how well that works with the GHC runtime. Does anyone have experience here?
The documenation for forkProcess
says that no other threads are copied, so I assume all data used by other threads will then be garbage collected in the fork, which sounds good. But that means that finalizers will run in both clone, which may or may not be the right thing to do…
I assume I can’t just use it without worry; but are there rules I can follow that will make sure its use is safe?
It's safe to fork in a multithreaded program as long as you are very careful about the code between fork and exec. You can make only re-enterant (aka asynchronous-safe) system calls in that span.
fork creates a new process. The parent of a process is another process, not a thread. So the parent of the new process is the old process. Note that the child process will only have one thread because fork only duplicates the (stack for the) thread that calls fork .
In the computing field, fork() is the primary method of process creation on Unix-like operating systems. This function creates a new copy called the child out of the original process, that is called the parent. When the parent process closes or crashes for some reason, it also kills the child process.
The fork() creates a child process by duplicating the calling process. The process that invoked fork() is the parent process and the newly created process is the child process.
But that means that finalizers will run in both clone, which may or may not be the right thing to do…
Finalizers are very rarely used in Haskell, and even where they are used, I would expect them to only have in-process effects. For example, a finalizer calls hClose
on garbage-collected Handles if you forgot to do it yourself. This is easy to demonstrate: the following program fails with openFile: resource exhausted (Too many open files)
, but if you uncomment the pure ()
, the Handles get garbage-collected and the program completes successfully.
import Control.Concurrent
import Control.Monad
import System.IO
import System.Mem
main :: IO ()
main = do
rs <- replicateM 1000 $ do
threadDelay 1000 -- not sure why did is needed; maybe to give control back
-- to the OS, so it can recycle the file descriptors?
performGC
openFile "input" ReadMode
--pure ()
print rs -- force all the Handles to still be alive by this point
File descriptors are process-owned and are copied by forkProcess
, so it makes sense to have each clone close their copies.
The case which would be problematic is if a finalizer was cleaning up a system-owned resource, e.g. deleting a file. But I hope no library is relying on finalizers to delete such resources, because as the documentation explains, finalizers are not guaranteed to run. So it's better to use something like bracket
to cleanup resources (although the cleanup is still not guaranteed, e.g. if bracket
is used from a thread).
What the documentation for forkProcess
is warning about is not finalizers, but the fact that other threads will appear to end abruptly inside the forked process. This is especially problematic if those threads are holding locks. Normally, two threads can use modifyMVar_
to ensure that only one thread at a time is running a critical section, and as long as each thread is only holding the lock for a finite amount of time, the other thread can simply wait for the MVar
to become available. If you call forkProcess
while one thread is in the middle of a modifyMVar_
, however, that thread will not continue in the cloned process, and so the cloned process cannot simply call modifyMVar_
or it could get stuck forever while waiting for a non-existing thread to release the lock. Here is a program demonstrating the problem.
import Control.Concurrent
import Control.Monad
import System.Posix.Process
-- >>> main
-- (69216,"forkIO thread",0)
-- (69216,"main thread",1)
-- (69216,"forkIO thread",2)
-- (69216,"main thread",3)
-- (69216,"forkIO thread",4)
-- (69216,"main thread",5)
-- calling forkProcess
-- forkProcess main thread waiting for MVar...
-- (69216,"forkIO thread",6)
-- (69216,"original main thread",7)
-- (69216,"forkIO thread",8)
-- (69216,"original main thread",9)
-- (69216,"forkIO thread",10)
-- (69216,"original main thread",11)
main :: IO ()
main = do
mvar <- newMVar (0 :: Int)
_ <- forkIO $ replicateM_ 6 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "forkIO thread", i)
pure (i+1)
threadDelay 50000
replicateM_ 3 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "main thread", i)
pure (i+1)
putStrLn "calling forkProcess"
_ <- forkProcess $ do
threadDelay 25000
replicateM_ 3 $ do
putStrLn "forkProcess main thread waiting for MVar..."
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "forkProcess main thread", i)
pure (i+1)
replicateM_ 3 $ do
modifyMVar_ mvar $ \i -> do
threadDelay 100000
processID <- getProcessID
print (processID, "original main thread", i)
pure (i+1)
threadDelay 100000
As the output shows, the forkProcess main thread gets stuck waiting forever for the MVar, and never prints the forkProcess main thread
line. If you move the threadDelay
s outside the modifyMVar_
critical section, the forkIO thread is a lot less likely to be in the middle of that critical section when forkProcess
is called, so you'll see an output which looks like this instead:
(69369,"forkIO thread",0)
(69369,"main thread",1)
(69369,"forkIO thread",2)
(69369,"main thread",3)
(69369,"forkIO thread",4)
(69369,"main thread",5)
calling forkProcess
(69369,"forkIO thread",6)
(69369,"original main thread",7)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",6)
(69369,"forkIO thread",8)
(69369,"original main thread",9)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",7)
(69369,"forkIO thread",10)
(69369,"original main thread",11)
forkProcess main thread waiting for MVar...
(69370,"forkProcess main thread",8)
After the forkProcess
call, there are now two MVars which both hold the value 5, and so in the original process, original main thread
and forkIO thread
are both incrementing one MVar, while in the other process forkProcess main thread
is incrementing the other.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With