
Why does Haskell's bracket function work in executables but fail to clean up in tests?

I'm seeing strange behavior: Haskell's bracket function behaves differently depending on whether I use stack run or stack test.

Consider the following code, where two nested brackets are used to create and clean up Docker containers:

module Main where

import Control.Concurrent
import Control.Exception
import System.Process

main :: IO ()
main = do
  bracket (callProcess "docker" ["run", "-d", "--name", "container1", "registry:2"])
          (\() -> do
              putStrLn "Outer release"
              callProcess "docker" ["rm", "-f", "container1"]
              putStrLn "Done with outer release"
          )
          (\() -> do
             bracket (callProcess "docker" ["run", "-d", "--name", "container2", "registry:2"])
                     (\() -> do
                         putStrLn "Inner release"
                         callProcess "docker" ["rm", "-f", "container2"]
                         putStrLn "Done with inner release"
                     )
                     (\() -> do
                         putStrLn "Inside both brackets, sleeping!"
                         threadDelay 300000000
                     )
          )

When I run this with stack run and interrupt with Ctrl+C, I get the expected output:

Inside both brackets, sleeping!
^CInner release
container2
Done with inner release
Outer release
container1
Done with outer release

And I can verify that both Docker containers are created and then removed.

However, if I paste this exact same code into a test and run stack test, only (part of) the first cleanup happens:

Inside both brackets, sleeping!
^CInner release
container2

This results in a Docker container left running on my machine. What's going on?

  • I've made sure that the exact same ghc-options are passed to both.
  • Full demonstration repo here: https://github.com/thomasjm/bracket-issue

Asked by tom, Jan 14 '20 07:01


1 Answer

When you use stack run, Stack effectively uses an exec system call to transfer control to the executable, so the process for the new executable replaces the running Stack process, much as if you'd run the executable directly from the shell. Here's what the process tree looks like after stack run. Note in particular that the executable is a direct child of the Bash shell. More critically, note that the terminal's foreground process group (TPGID) is 17996, and the only process in that process group (PGID) is the bracket-test-exe process.

PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
13816 13831 13831 13831 pts/3    17996 Ss    2001   0:00  |       \_ /bin/bash --noediting -i
13831 17996 17996 13831 pts/3    17996 Sl+   2001   0:00  |       |   \_ .../.stack-work/.../bracket-test-exe

As a result, when you press Ctrl-C to interrupt the process running either under stack run or directly from the shell, the SIGINT signal is delivered only to the bracket-test-exe process. This raises an asynchronous UserInterrupt exception. The way bracket works, when:

bracket
  acquire
  (\() -> release)
  (\() -> body)

receives an asynchronous exception while processing body, it runs release and then re-raises the exception. With your nested bracket calls, this has the effect of interrupting the inner body, processing the inner release, re-raising the exception to interrupt the outer body, and processing the outer release, and finally re-raising the exception to terminate the program. (If there were more actions following the outer bracket in your main function, they wouldn't be executed.)
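The unwinding order described above can be reproduced without Docker or a terminal. The following is only a sketch: nestedReleaseOrder, record, worker, and handler are names made up for illustration, and throwTo stands in for the UserInterrupt that the runtime raises on Ctrl-C.

```haskell
import Control.Concurrent (forkIO, threadDelay, throwTo)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Exception
  (AsyncException (UserInterrupt), bracket, catch, finally, throwIO)
import Data.IORef (modifyIORef', newIORef, readIORef)

-- Run two nested brackets in a worker thread, interrupt the inner body with
-- UserInterrupt, and return the logged events in the order they happened.
-- (Hypothetical helper, not part of the original program.)
nestedReleaseOrder :: IO [String]
nestedReleaseOrder = do
  logRef  <- newIORef []
  started <- newEmptyMVar
  done    <- newEmptyMVar
  let record msg = modifyIORef' logRef (msg :)
      worker =
        bracket (record "outer acquire") (\() -> record "outer release") $ \() ->
          bracket (record "inner acquire") (\() -> record "inner release") $ \() -> do
            putMVar started ()      -- signal that the inner body is running
            threadDelay 10000000    -- stand-in for the long-running body
      -- Runs only after both release actions have finished.
      handler e = case e of
        UserInterrupt -> record "UserInterrupt re-raised"
        other         -> throwIO other
  tid <- forkIO ((worker `catch` handler) `finally` putMVar done ())
  takeMVar started            -- wait until we are inside both brackets
  throwTo tid UserInterrupt   -- simulate Ctrl-C
  takeMVar done
  reverse <$> readIORef logRef

main :: IO ()
main = nestedReleaseOrder >>= mapM_ putStrLn
```

The log shows the inner release before the outer release, and the exception only reaches the handler after both have completed.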

On the other hand, when you use stack test, Stack uses withProcessWait to launch the executable as a child process of the stack test process. In the following process tree, note that bracket-test-test is a child process of stack test. Critically, the terminal's foreground process group is 18050, and that process group includes both the stack test process and the bracket-test-test process.

PPID   PID  PGID   SID TTY      TPGID STAT   UID   TIME COMMAND
13816 13831 13831 13831 pts/3    18050 Ss    2001   0:00  |       \_ /bin/bash --noediting -i
13831 18050 18050 13831 pts/3    18050 Sl+   2001   0:00  |       |   \_ stack test
18050 18060 18050 13831 pts/3    18050 Sl+   2001   0:00  |       |       \_ .../.stack-work/.../bracket-test-test
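If you'd rather check this from inside the program than read ps output, something along these lines should work. It's a rough sketch using System.Posix.Process from the unix package; sharesGroupWithParent is a name invented for the example.

```haskell
import System.Posix.Process
  (getParentProcessID, getProcessGroupID, getProcessGroupIDOf, getProcessID)

-- True when this process is in the same process group as its parent.
-- (Hypothetical helper for illustration.)
sharesGroupWithParent :: IO Bool
sharesGroupWithParent = do
  ours    <- getProcessGroupID
  parents <- getParentProcessID >>= getProcessGroupIDOf
  pure (ours == parents)

main :: IO ()
main = do
  pid    <- getProcessID
  pgid   <- getProcessGroupID
  shared <- sharesGroupWithParent
  putStrLn ("PID " ++ show pid ++ ", PGID " ++ show pgid)
  putStrLn ("same process group as parent: " ++ show shared)
```

Going by the two process trees, I'd expect this to report that the group is shared under stack test but not under stack run, though I haven't tried it in every setup.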

When you hit Ctrl-C in the terminal, the SIGINT signal is sent to all processes in the terminal's foreground process group so both stack test and bracket-test-test get the signal. bracket-test-test will start processing the signal and running finalizers as described above. However, there's a race condition here because when stack test is interrupted, it's in the middle of withProcessWait which is defined more or less as follows:

withProcessWait config f =
  bracket
    (startProcess config)
    stopProcess
    (\p -> f p <* waitExitCode p)

so, when its bracket is interrupted, it calls stopProcess, which terminates the child process by sending it the SIGTERM signal. In contrast to SIGINT, SIGTERM isn't turned into an asynchronous exception by the GHC runtime. Its default action simply terminates the child immediately, generally before it can finish running any finalizers.

I can't think of a particularly easy way to work around this. One way is to use the facilities in System.Posix to put the process into its own process group:

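import Control.Concurrent (threadDelay)
import Control.Exception (bracket)
import System.Posix.Process (createProcessGroupFor, getProcessID)
import System.Posix.Terminal (getTerminalProcessGroupID, setTerminalProcessGroupID)
import System.Posix.Types (Fd (..))

main :: IO ()
main = do
  -- save old terminal foreground process group (Fd 2 is stderr)
  oldpgid <- getTerminalProcessGroupID (Fd 2)
  -- get our PID
  mypid <- getProcessID
  let -- put us in our own foreground process group
      handleInt  = setTerminalProcessGroupID (Fd 2) mypid >> createProcessGroupFor mypid
      -- restore the old foreground process group
      releaseInt = setTerminalProcessGroupID (Fd 2) oldpgid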
  bracket
    (handleInt >> putStrLn "acquire")
    (\() -> threadDelay 1000000 >> putStrLn "release" >> releaseInt)
    (\() -> putStrLn "between" >> threadDelay 60000000)
  putStrLn "finished"

Now, Ctrl-C will result in SIGINT being delivered only to the bracket-test-test process. It'll clean up, restore the original foreground process group to point to the stack test process, and terminate. Because stack test never receives the SIGINT itself, it just keeps running and sees the test fail.

An alternative would be to handle SIGTERM in the child and keep it running to perform cleanup, even after the stack test process has terminated. This is somewhat ugly, since the process will still be cleaning up in the background while you're already looking at the shell prompt.
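A minimal sketch of the SIGTERM approach, assuming the unix package's System.Posix.Signals: the handler forwards SIGTERM to the main thread as UserInterrupt, so enclosing brackets unwind the same way they do on Ctrl-C. The names termAsInterrupt and sigtermDemo are invented for the example, and raiseSignal is used only to simulate the SIGTERM that the parent would send. The release action is wrapped in uninterruptibleMask_ because, under stack test, the SIGTERM arrives while cleanup from the earlier SIGINT may already be running, and a release action that blocks (such as callProcess) could otherwise be interrupted by the second exception. The price is that a hung cleanup can no longer be interrupted this way.

```haskell
import Control.Concurrent (myThreadId, threadDelay, throwTo)
import Control.Exception
  (AsyncException (UserInterrupt), bracket, try, uninterruptibleMask_)
import Control.Monad (void)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.Posix.Signals
  (Handler (Catch), installHandler, raiseSignal, sigTERM)

-- Forward SIGTERM to the calling thread as UserInterrupt, the exception the
-- runtime raises for SIGINT, so that enclosing brackets run their release.
-- (Hypothetical helper for illustration.)
termAsInterrupt :: IO ()
termAsInterrupt = do
  tid <- myThreadId
  void (installHandler sigTERM (Catch (throwTo tid UserInterrupt)) Nothing)

-- Send ourselves SIGTERM from inside a bracket and log what happens.
sigtermDemo :: IO [String]
sigtermDemo = do
  logRef <- newIORef []
  let record msg = modifyIORef' logRef (msg :)
  termAsInterrupt
  result <- try $ bracket
    (record "acquire")
    -- uninterruptibleMask_ keeps a further signal from cutting cleanup short
    (\() -> uninterruptibleMask_ (record "release"))
    (\() -> record "body" >> raiseSignal sigTERM >> threadDelay 5000000)
  case result of
    Left UserInterrupt -> record "interrupted"
    Left other         -> record ("other exception: " ++ show other)
    Right ()           -> record "body finished"
  reverse <$> readIORef logRef

main :: IO ()
main = sigtermDemo >>= mapM_ putStrLn
```

In a real test executable you would call termAsInterrupt once at the top of main and leave the brackets as they are, apart from protecting the release actions.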

Answered by K. A. Buhr, Nov 02 '22 17:11