Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

forkIO and coroutines in Haskell

I am trying to understand Coroutines but don't quite understand their purpose given the existence of threads with forkIO. What use cases exactly necessitate using coroutines over threads?

like image 617
user782220 Avatar asked Dec 26 '14 10:12

user782220


2 Answers

It's a bit unclear from your question if you are talking about a specific Haskell coroutines implementation (if yes, please add a link), or about the general concept.

Using forkIO and some kind of inter-thread communication is one way how to implement coroutines. The advantage is that this way you can take the advantage of having multiple CPUs/cores, but in my opinion, there are several drawbacks:

  • Explicit concurrency IO based, so all your computations must run in the IO monad.
  • You have to explicitly implement the inter-thread communication.
  • You have to take care of starting threads and more importantly, of disposing them and of preventing starvation/deadlocks.
  • The architecture is (obviously) multi-threaded. In some circumstances, this could be a disadvantage. For example, you might want your computation to be pure, deterministic, single-threaded, but still use the concept of coroutines.

I'll further assume that your question was about this Coroutine implementation.

Let me give a small example. Let's suppose we want to compute big factorials, but since the computations might take too long, we want it to pause after each cycle so that we can give some feedback to the user. Moreover, we want to signal how many cycles are left to be computed:

import Control.Monad
import Control.Monad.Coroutine
import Control.Monad.Coroutine.SuspensionFunctors
import Control.Parallel
import Data.Functor.Identity

-- A helper function, a monadic version of 'pogoStick':

-- | Runs a suspendable 'Coroutine' to its completion.
pogoStickM :: Monad m => (s (Coroutine s m x) -> m (Coroutine s m x))
                      -> Coroutine s m x -> m x
pogoStickM spring c = resume c >>= either (pogoStickM spring <=< spring) return


factorial1 :: (Monad m) => Integer -> Coroutine (Yield Integer) m Integer
factorial1 = loop 1
  where
    loop r 0 = return r
    loop r n = do
                  let r' = r * n
                  r' `par` yield n
                  (r' `pseq` loop r') (n - 1)


run1 :: IO ()
run1 = pogoStickM (\(Yield i c) -> print i >> return c) (factorial1 20) >>= print

Now let's say we realize that yielding after every cycle is too inefficient. Instead, we want the caller to decide how many cycles we should compute before suspending again. To achieve that, we just replace the Yield functor with Request:

factorial2 :: (Monad m) => Integer
                        -> Coroutine (Request Integer Integer) m Integer
factorial2 n = loop 1 n n
  where
    loop r t 0 = return r
    loop r t n | t >= n       = r' `par` request n >>= rec
               | otherwise    = rec t
      where
        rec t' = (r' `pseq` loop r') t' (n - 1)
        r' = r * n

run2 :: IO ()
run2 = pogoStickM (\(Request i c) -> print i >> return (c (i - 5)))
                  (factorial2 30)
       >>= print

While our run... examples are IO-based, the computations of factorials are pure, they allowe any monad (including Identity).

Still, using Haskell's parallelism we were running the pure computations in parallel with the reporting code (before yielding from the coroutine we create a spark that computes the multiplication step using par).

And perhaps most importantly, the types ensure that the coroutines can't misbehave. There is no way how the coroutines could deadlock - yielding or requesting feedback is always coupled with the appropriate response (unless the caller decides not to continue with the coroutine, in which case it's automatically removed by the garbage collector, there is no blocked thread).

like image 165
Petr Avatar answered Nov 04 '22 08:11

Petr


No use cases truly neccesitate coroutines. Everything you can do with coroutines, you can do with forkIO + some communication channel. In fact, I believe Go (a language in which concurrency is very cheap, like in Haskell) eschews coroutines altogether and does everything with concurrent lightweight threads ("goroutines").

However, sometimes forkIO is overkill. Sometimes you don't need concurrency, you only want to decompose a problem into conceptually separate flows of instructions that yield to each other at certain explicitly determined points.

Consider the task of reading from a file and writing to another. Instead of having a monolithic nested loop, a more reusable solution would be to compose a file-reading coroutine with a file-writing one. When you decide later to print the file to screen instead, you don't need to modify the file-reading coroutine at all, you only have to compose it differently. But notice that this problem has really nothing to do with concurrency, it is about separation of concerns and reusability.

like image 24
danidiaz Avatar answered Nov 04 '22 10:11

danidiaz