I want to implement efficient single-threaded socket communication using "epoll
"-style event management.
If I were to write a very imperative program "from scratch," I would do it basically like this (just some pseudo-ish code I just typed out - probably won't compile):
import Control.Concurrent
import Data.ByteString (ByteString)
import qualified Data.ByteString as ByteString
import qualified GHC.Event as Event
import Network
import Network.Socket
import Network.Socket.ByteString
main = withSocketFromSomewhere $ \ socket -> do
let fd = fromIntegral . fdSocket $ socket
-- Some app logic
state <- newMVar "Bla"
-- Event manager
manager <- Event.new
-- Do an initial write
initialWrite socket state manager
-- Manager does its thing
Event.loop manager
write manager socket bs =
-- Should be pretty straight-forward
Event.registerFd manager theWrite fd Event.evtWrite
where
fd = fromIntegral . fdSocket $ socket
theWrite key _ = do
Event.unregisterFd manager key
sendAll socket bs
read manager socket cont =
-- Ditto
Event.registerFd manager theRead fd Event.evtRead
where
fd = fromIntegral . fdSocket $ socket
theRead key _ = do
Event.unregisterFd manager key
bs <- recv socket 4096
cont bs
initialWrite socket state manager = do
msg <- readMVar state
write manager socket msg
read manager socket $ \ bs -> do
ByteString.putStrLn bs
putMVar state msg
Imagine that there also are some functions that add timeout events to the manager, and such.
However, this code isn't particularly nice, for a few reasons:
MVar
for my application logic, because I can't tell the opaque event manager that it should pass around some state for me, even though I know that it only uses one thread and therefore could potentially be used as the base of a monad transformer stack.Now, this just screams for the use of a plethora of monad transformers etc. I'd like to be able to just do this:
main =
withSocketFromSomewhere $ \ socket ->
runEvents . flip runStateT "Bla" $ initialWrite socket
initialWrite socket = do
msg <- lift get
write socket msg
resp <- read socket
liftIO $ ByteString.putStrLn resp
lift $ put msg
This code should have the same behavior as the above code; e.g. by suspending the computation until a read has been received on the resp <- read socket
line, and letting me manage multiple sockets on the same thread/manager.
Questions:
The I/O monad contains primitives which build composite actions, a process similar to joining statements in sequential order using `;' in other languages. Thus the monad serves as the glue which binds together the actions in a program.
Monads are simply a way to wrapping things and provide methods to do operations on the wrapped stuff without unwrapping it. For example, you can create a type to wrap another one, in Haskell: data Wrapped a = Wrap a. To wrap stuff we define return :: a -> Wrapped a return x = Wrap x.
Using Maybe is a good way to deal with errors or exceptional cases without resorting to drastic measures such as error . The Maybe type is also a monad. It is a simple kind of error monad, where all errors are represented by Nothing . A richer error monad can be built using the Either type.
Map is not one of the defining properties of monads, however, because it's technically just a special case of FlatMap. A lifting function like Unit will wrap its object in a container, even if that object is itself the same type of container.
I highly recommend you read A language-based approach to unifying events and threads. It talks about how you can structure any concurrency system you want on top of your IO subsystem of choice and in their paper they actually implement it on top of epoll
.
Unfortunately, the data type and code examples in the paper are incredibly poor and it took some time (at least, for me) to reverse engineer their code, and there are even some mistakes in their paper. However, their approach is actually a subset of a very powerful and general approach known as "free monads".
For example, their Trace
data type is just a free monad in disguise. To see why, let's consult the Haskell definition of a Free monad:
data Free f r = Pure r | Free (f (Free f r))
A free monad is like a "list of functors", where Pure
is analogous to a list's Nil
constructor and Free
is analogous to a list's Cons
constructor because it prepends an additional functor onto the "list". Technically, if I were pedantic, there is nothing that says a free monad has to be implemented as the above list-like data type, but whatever you implement has to be isomorphic to the above data type.
The nice thing about a free monad is that, given a functor f
, Free f
is automatically a monad:
instance (Functor f) => Monad (Free f) where
return = Pure
Pure r >>= f = f r
Free x >>= f = Free (fmap (>>= f) x)
That means we can decompose their Trace
data type into two parts, the base functor f
and then the free monad generated by f
:
-- The base functor
data TraceF x =
SYS_NBIO (IO x)
| SYS_FORK x x
| SYS_YIELD x
| SYS_RET
| SYS_EPOLL_WAIT FD EPOLL_EVENT x
-- You can even skip this definition if you use the GHC
-- "DerivingFunctor" extension
instance Functor TraceF where
fmap f (SYS_NBIO x) = SYS_NBIO (liftM f x)
fmap f (SYS_FORK x) = SYS_FORK (f x) (f x)
fmap f (SYS_YIELD x) = SYS_YIELD (f x)
fmap f SYS_RET = SYS_RET
fmap f (SYS_EPOLL_WAIT FD EPOLL_EVENT x) = SYS_EPOLL_WAIT FD EPOLL_EVEN (f x)
Given that functor, you get the Trace
monad "for free":
type Trace a = Free TraceF a
-- or: type Trace = Free TraceF
... although that's not why it's called the "free" monad.
It then is easier to define all their functions:
liftF = Free . fmap Pure
-- if "Free f" is like a list of "f", then
-- this is sort of like: "liftF x = [x]"
-- it's just a convenience function
-- their definitions are written in continuation-passing style,
-- presumably for efficiency, but they are equivalent to these
sys_nbio io = liftF (SYS_NBIO io)
sys_fork t = SYS_FORK t (return ()) -- intentionally didn't use liftF
sys_yield = liftF (SYS_YIELD ())
sys_ret = liftF SYS_RET
sys_epoll_wait fd event = liftF (SYS_EPOLL_WAIT fd event ())
So then you can use these commands just like a monad:
myTrace fd event = do
sys_nbio (putStrLn "Hello, world")
fork $ do
sys_nbio (putStrLn "Hey")
sys_expoll_wait fd event
Now, here's the key concept. That monad I just wrote only creates a data type. THat's it. It doesn't interpret it at all. It's just like how you would write an abstract syntax tree for an expression. It's completely up to you how you want to evaluate it. In the paper they give a concrete example of an interpreter for the expression, but it's trivial to write your own.
The important concept is that this interpreter can run in any monad you want. So if you want to thread some state through your concurrency, you can do that. For example, here's a toy interpreter which uses the StateT IO monad to keep track of how many times an IO action was called:
interpret t = case t of
SYS_NBIO io -> do
modify (+1)
t' <- lift io
interpret t'
...
You can even thread monads across forkIO'd actions! Here is some very old code of mine which is buggy and lame because it was written back when I was much less experienced and had no idea what free monads were, but it demonstrates this in action:
module Thread (Thread(..), done, lift, branch, fork, run) where
import Control.Concurrent
import Control.Concurrent.STM
import Control.Monad.Cont
import Data.Sequence
import qualified Data.Foldable as F
data Thread f m =
Done
| Lift (m (Thread f m))
| LiftIO (IO (Thread f m))
| Branch (f (Thread f m))
| Exit
done = cont $ \c -> Done
lift' x = cont $ \c -> Lift $ liftM c x
liftIO' x = cont $ \c -> LiftIO $ liftM c x
branch x = cont $ \c -> Branch $ fmap c x
exit = cont $ \c -> Exit
fork x = join $ branch [return (), x >> done]
run x = do
q <- liftIO $ newTChanIO
enqueue q $ runCont x $ \_ -> Done
loop q
where
loop q = do
t <- liftIO $ atomically $ readTChan q
case t of
Exit -> return ()
Done -> loop q
Branch ft -> mapM_ (enqueue q) ft >> loop q
Lift mt -> (mt >>= enqueue q) >> loop q
LiftIO it -> (liftIO $ forkIO $ it >>= enqueue q) >> loop q
enqueue q = liftIO . atomically . writeTChan q
The point behind free monads is that they provide the monad instance AND NOTHING ELSE. In other words, they step back and give you complete freedom how you want to interpret them, which is why they are so incredibly useful.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With