"Monad-friendly" event-based IO

Q: How does IO monad work?

The I/O monad contains primitives which build composite actions, a process similar to joining statements in sequential order using `;' in other languages. Thus the monad serves as the glue which binds together the actions in a program.

Q: What is a monad example?

Monads are simply a way to wrapping things and provide methods to do operations on the wrapped stuff without unwrapping it. For example, you can create a type to wrap another one, in Haskell: data Wrapped a = Wrap a. To wrap stuff we define return :: a -> Wrapped a return x = Wrap x.

Tags:

events

haskell

concurrency

sockets

I want to implement efficient single-threaded socket communication using "epoll"-style event management.

If I were to write a very imperative program "from scratch," I would do it basically like this (just some pseudo-ish code I just typed out - probably won't compile):

import Control.Concurrent

import Data.ByteString (ByteString)
import qualified Data.ByteString as ByteString

import qualified GHC.Event as Event

import Network
import Network.Socket
import Network.Socket.ByteString

main = withSocketFromSomewhere $ \ socket -> do
  let fd = fromIntegral . fdSocket $ socket

  -- Some app logic
  state <- newMVar "Bla"

  -- Event manager
  manager <- Event.new

  -- Do an initial write
  initialWrite socket state manager

  -- Manager does its thing
  Event.loop manager

write manager socket bs =
  -- Should be pretty straight-forward
  Event.registerFd manager theWrite fd Event.evtWrite
  where
    fd = fromIntegral . fdSocket $ socket
    theWrite key _ = do
      Event.unregisterFd manager key
      sendAll socket bs

read manager socket cont =
  -- Ditto
  Event.registerFd manager theRead fd Event.evtRead
  where
    fd = fromIntegral . fdSocket $ socket
    theRead key _ = do
      Event.unregisterFd manager key
      bs <- recv socket 4096
      cont bs

initialWrite socket state manager = do
  msg <- readMVar state
  write manager socket msg
  read manager socket $ \ bs -> do
    ByteString.putStrLn bs
    putMVar state msg

Imagine that there also are some functions that add timeout events to the manager, and such.

However, this code isn't particularly nice, for a few reasons:

I carry around the event manager manually.
I have to use an MVar for my application logic, because I can't tell the opaque event manager that it should pass around some state for me, even though I know that it only uses one thread and therefore could potentially be used as the base of a monad transformer stack.
I have to create explicit delimited continuations for reads (And I might even have to do this for writes; I don't know what would be wiser in this situation).

Now, this just screams for the use of a plethora of monad transformers etc. I'd like to be able to just do this:

main =
  withSocketFromSomewhere $ \ socket ->
  runEvents . flip runStateT "Bla" $ initialWrite socket

initialWrite socket = do
  msg <- lift get
  write socket msg
  resp <- read socket
  liftIO $ ByteString.putStrLn resp
  lift $ put msg

This code should have the same behavior as the above code; e.g. by suspending the computation until a read has been received on the resp <- read socket line, and letting me manage multiple sockets on the same thread/manager.

Questions:

Is there a more high-level interface to the GHC events API/libevent/equivalent that gives the user some more power? Is it even worth it considering the synchronous IO scheduling changes that have happened in recent GHCs (I'm on 7.4.1)?
What if I want to implement cooperative concurrency, by for example having one function that always handles reads from the socket, but having this function share the same state monad as the write "thread"? Can this be done with whatever solution from (1)?

299

asked Apr 28 '12 14:04

dflemstr

1 Answers

I highly recommend you read A language-based approach to unifying events and threads. It talks about how you can structure any concurrency system you want on top of your IO subsystem of choice and in their paper they actually implement it on top of epoll.

Unfortunately, the data type and code examples in the paper are incredibly poor and it took some time (at least, for me) to reverse engineer their code, and there are even some mistakes in their paper. However, their approach is actually a subset of a very powerful and general approach known as "free monads".

For example, their Trace data type is just a free monad in disguise. To see why, let's consult the Haskell definition of a Free monad:

data Free f r = Pure r | Free (f (Free f r))

A free monad is like a "list of functors", where Pure is analogous to a list's Nil constructor and Free is analogous to a list's Cons constructor because it prepends an additional functor onto the "list". Technically, if I were pedantic, there is nothing that says a free monad has to be implemented as the above list-like data type, but whatever you implement has to be isomorphic to the above data type.

The nice thing about a free monad is that, given a functor f, Free f is automatically a monad:

instance (Functor f) => Monad (Free f) where
    return = Pure
    Pure r >>= f = f r
    Free x >>= f = Free (fmap (>>= f) x)

That means we can decompose their Trace data type into two parts, the base functor f and then the free monad generated by f:

-- The base functor
data TraceF x =
    SYS_NBIO (IO x)
  | SYS_FORK x x
  | SYS_YIELD x
  | SYS_RET
  | SYS_EPOLL_WAIT FD EPOLL_EVENT x

-- You can even skip this definition if you use the GHC
-- "DerivingFunctor" extension
instance Functor TraceF where
    fmap f (SYS_NBIO x) = SYS_NBIO (liftM f x)
    fmap f (SYS_FORK x) = SYS_FORK (f x) (f x)
    fmap f (SYS_YIELD x) = SYS_YIELD (f x)
    fmap f SYS_RET = SYS_RET
    fmap f (SYS_EPOLL_WAIT FD EPOLL_EVENT x) = SYS_EPOLL_WAIT FD EPOLL_EVEN (f x)

Given that functor, you get the Trace monad "for free":

type Trace a = Free TraceF a
-- or: type Trace = Free TraceF

... although that's not why it's called the "free" monad.

It then is easier to define all their functions:

liftF = Free . fmap Pure
-- if "Free f" is like a list of "f", then
-- this is sort of like: "liftF x = [x]"
-- it's just a convenience function

-- their definitions are written in continuation-passing style,
-- presumably for efficiency, but they are equivalent to these
sys_nbio io = liftF (SYS_NBIO io)
sys_fork t = SYS_FORK t (return ()) -- intentionally didn't use liftF
sys_yield = liftF (SYS_YIELD ())
sys_ret = liftF SYS_RET
sys_epoll_wait fd event = liftF (SYS_EPOLL_WAIT fd event ())

So then you can use these commands just like a monad:

myTrace fd event = do
    sys_nbio (putStrLn "Hello, world")
    fork $ do
        sys_nbio (putStrLn "Hey")
    sys_expoll_wait fd event

Now, here's the key concept. That monad I just wrote only creates a data type. THat's it. It doesn't interpret it at all. It's just like how you would write an abstract syntax tree for an expression. It's completely up to you how you want to evaluate it. In the paper they give a concrete example of an interpreter for the expression, but it's trivial to write your own.

The important concept is that this interpreter can run in any monad you want. So if you want to thread some state through your concurrency, you can do that. For example, here's a toy interpreter which uses the StateT IO monad to keep track of how many times an IO action was called:

interpret t = case t of
    SYS_NBIO io -> do
        modify (+1)
        t' <- lift io
        interpret t'
    ...

You can even thread monads across forkIO'd actions! Here is some very old code of mine which is buggy and lame because it was written back when I was much less experienced and had no idea what free monads were, but it demonstrates this in action:

module Thread (Thread(..), done, lift, branch, fork, run) where

import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad.Cont
import Data.Sequence
import qualified Data.Foldable as F

data Thread f m =
    Done
  | Lift (m (Thread f m))
  | LiftIO (IO (Thread f m))
  | Branch (f (Thread f m))
  | Exit

done = cont $ \c -> Done
lift' x = cont $ \c -> Lift $ liftM c x
liftIO' x = cont $ \c -> LiftIO $ liftM c x
branch x = cont $ \c -> Branch $ fmap c x
exit = cont $ \c -> Exit

fork x = join $ branch [return (), x >> done]

run x = do
    q <- liftIO $ newTChanIO
    enqueue q $ runCont x $ \_ -> Done
    loop q
  where
    loop q = do
        t <- liftIO $ atomically $ readTChan q
        case t of
            Exit -> return ()
            Done -> loop q
            Branch ft -> mapM_ (enqueue q) ft >> loop q
            Lift mt -> (mt >>= enqueue q) >> loop q
            LiftIO it -> (liftIO $ forkIO $ it >>= enqueue q) >> loop q
    enqueue q = liftIO . atomically . writeTChan q

The point behind free monads is that they provide the monad instance AND NOTHING ELSE. In other words, they step back and give you complete freedom how you want to interpret them, which is why they are so incredibly useful.

answered Sep 22 '22 02:09

Gabriella Gonzalez

Related questions
                            
                                Multiple processes listening on the same port?
                            
                                Why does TCP socket programming need two sockets(one welcome socket and one connection socket) but UDP only needs one?
                            
                                Sockets: BufferedReader readLine() blocks
                            
                                How to append new line character in node.js?
                            
                                How do I receive udp packets with winsock in c++?
                            
                                Node Express specific timeout value per route
                            
                                Java, increase the socket timeout
                            
                                how can i stop the block method DatagramSocket.receive() in a thread
                            
                                Java: Efficiency of the readLine method of the BufferedReader and possible alternatives
                            
                                Is there a way to specify the local port to used in tcpClient?
                            
                                Get network address and network mask in Python
                            
                                Does send() always send whole buffer?
                            
                                Servname not supported for ai_socktype
                            
                                socket_recv not receiving full data
                            
                                What is weblogic.socket.Muxer?
                            
                                Epoll and remote 1-way shutdown
                            
                                Java InputStream wait for data.
                            
                                what can cause a socket() "Permission denied" error?
                            
                                Behavior of shutdown(sock, SHUT_RD) with TCP
                            
                                Unix socket credential passing in Python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With