This was a complete surprise for me. Can someone explain the reason behind readIORef blocking when there is an atomicModifyIORef in flight? I understand the assumption is that the modifying function supplied to the latter is supposed to be very quick, but that is beside the point.
Here is a sample piece of code that reproduces what I am talking about:
{-# LANGUAGE NumericUnderscores #-}
module Main where

import Control.Concurrent
import Control.Concurrent.Async
import Control.Monad
import Data.IORef
import Say (sayString)
import Data.Time.Clock
import System.IO.Unsafe

main :: IO ()
main = do
  ref <- newIORef (10 :: Int)
  before <- getCurrentTime
  race_ (threadBusy ref 10_000_000) (threadBlock ref)
  after <- getCurrentTime
  sayString $ "Elapsed: " ++ show (diffUTCTime after before)

threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  sayString "Below threads are totally blocked on a busy IORef"
  race_ (forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    -- need to give a bit of time to ensure ref is set to busy by another thread
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  -- threadDelay is not required above, a simple busy loop that takes a while works just as well
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y
Running this piece of code produces:
$ stack exec --package time --package async --package say --force-dirty --resolver nightly -- ghc -O2 -threaded atomic-ref.hs && ./atomic-ref
Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 10.003357215s
Note that readIORef: Wating ... is printed only twice, once before blocking and once afterwards. This is very unexpected, since that action runs in a totally separate thread. It means that blocking on the IORef affects threads other than the one that invoked readIORef, which is even more surprising.
Are those semantics expected, or is it a bug? If it is not a bug, why is this expected? I'll open a GHC bug later, unless someone has an explanation for this behavior that I can't think of. I won't be surprised if this is some limitation of the GHC runtime, in which case I will provide an answer here later. Regardless of the outcome, it is very useful to know about this behavior.
Edit 1
The busy loop I tried that does not require unsafePerformIO was requested in comments, so here it is:
-- needs an extra import: Data.Bits (xor)
threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef ref (\x -> busyLoop 10000000000 `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y

busyLoop :: Int -> Int
busyLoop n = go 1 0
  where
    go acc i
      | i < n = go (i `xor` acc) (i + 1)
      | otherwise = acc
The outcome is exactly the same, except the runtime is slightly different.
Setting IORef to busy for 10000000 μs
Below threads are totally blocked on a busy IORef
readIORef: Wating ...
Unblocked with value: 100000
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Elapsed: 8.545412986s
Edit 2
It turns out that sayString was the reason the output was not appearing. Here is the output when sayString is swapped for putStrLn:
Below threads are totally blocked on a busy IORef
Setting IORef to busy for 10000000 μs
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
readIORef: Wating ...
Finished blocking the IORef, returned with value: 10
Unblocked with value: 100000
Elapsed: 10.002272691s
That still does not answer the question of why readIORef blocks. In fact, I just stumbled upon a quote from the book "Haskell High Performance" by Samuli Thomasson that tells us that blocking should not happen:
I think I understand what happens now. TLDR: readIORef is not a blocking operation! Big thanks to everyone who commented on the question.
The way I break down the logic mentally is (same as in question, but with added Thread names):
threadBlock :: IORef Int -> IO ()
threadBlock ref = do
  race_ ({- Thread C -} forever $ sayString "readIORef: Wating ..." >> threadDelay 500_000) $ do
    {- Thread B -}
    threadDelay 100_000
    x <- readIORef ref
    sayString $ "Unblocked with value: " ++ show x

threadBusy :: IORef Int -> Int -> IO ()
threadBusy ref n = do {- Thread A -}
  sayString $ "Setting IORef to busy for " ++ show n ++ " μs"
  y <- atomicModifyIORef' ref (\x -> unsafePerformIO (threadDelay n) `seq` (x * 10000, x))
  sayString $ "Finished blocking the IORef, returned with value: " ++ show y
Thread A updates the content of ref with a thunk that will be filled in when this computation is done: unsafePerformIO (threadDelay n) `seq` (x * 10000, x). The important part is that atomicModifyIORef' is most likely implemented with CAS (compare-and-swap), and the swap succeeded: the expected value matched, and the new value was set to a thunk that has not been evaluated yet. Because atomicModifyIORef' is strict, it has to wait until the value is computed, which takes 10 sec, before returning. So thread A blocks.

Thread B reads the content of ref with readIORef WITHOUT blocking. But once it attempts to print x, which is that thunk, it has to stop and wait until the thunk is filled with a value, which is still in the process of being computed. Because of that it has to wait, so it looks like it is blocked.

Thread C tries to print with sayString, but it fails to do so and therefore behaves as if it were blocked as well. From a quick look at the say package and GHC.IO.Handle, it looks like the Handle for stdout gets blocked by thread B, because printing in the say package is supposed to happen without interleaving; for that reason thread C could not do any printing either. That is why switching to putStrLn unblocked thread C and allowed it to print a message every 0.5 sec.

This definitely convinces me, but if anyone has a better explanation I'll be happy to accept another answer.