Regarding Haskell's monadic IO
construct:
Is it just a convention or is there is a implementation reason for it?
Could you not just FFI into libc.so instead to do your I/O, and skip the whole IO
-monad thing?
Would it work anyway, or is the outcome nondeterministic because of:
(a) Haskell's lazy evaluation?
(b) another reason, like that the GHC is pattern-matching for the IO
monad and then handling it in a special way (or something else)?
What is the real reason - in the end you end up using a side effect, so why not do it the simple way?
The I/O monad contains primitives which build composite actions, a process similar to joining statements in sequential order using `;' in other languages. Thus the monad serves as the glue which binds together the actions in a program.
Haskell separates pure functions from computations where side effects must be considered by encoding those side effects as values of a particular type. Specifically, a value of type (IO a) is an action, which if executed would produce a value of type a .
What is a Monad? A monad is an algebraic structure in category theory, and in Haskell it is used to describe computations as sequences of steps, and to handle side effects such as state and IO. Monads are abstract, and they have many useful concrete instances. Monads provide a way to structure a program.
A monadic function is a function with a single argument, written to its right. It is one of three possible function valences; the other two are dyadic and niladic. The term prefix function is used outside of APL to describe APL's monadic function syntax.
Yes, monadic I/O is a consequence of Haskell being lazy. Specifically, though, monadic I/O is a consequence of Haskell being pure, which is effectively necessary for a lazy language to be predictable.†
This is easy to illustrate with an example. Imagine for a moment that Haskell were not pure, but it was still lazy. Instead of putStrLn
having the type String -> IO ()
, it would simply have the type String -> ()
, and it would print a string to stdout as a side-effect. The trouble with this is that this would only happen when putStrLn
is actually called, and in a lazy language, functions are only called when their results are needed.
Here’s the trouble: putStrLn
produces ()
. Looking at a value of type ()
is useless, because ()
means “boring”. That means that this program would do what you expect:
main :: () main = case putStr "Hello, " of () -> putStrLn " world!" -- prints “Hello, world!\n”
But I think you can agree that programming style is pretty odd. The case ... of
is necessary, however, because it forces the evaluation of the call to putStr
by matching against ()
. If you tweak the program slightly:
main :: () main = case putStr "Hello, " of _ -> putStrLn " world!"
…now it only prints world!\n
, and the first call isn’t evaluated at all.
This actually gets even worse, though, because it becomes even harder to predict as soon as you start trying to do any actual programming. Consider this program:
printAndAdd :: String -> Integer -> Integer -> Integer printAndAdd msg x y = putStrLn msg `seq` (x + y) main :: () main = let x = printAndAdd "first" 1 2 y = printAndAdd "second" 3 4 in (y + x) `seq` ()
Does this program print out first\nsecond\n
or second\nfirst\n
? Without knowing the order in which (+)
evaluates its arguments, we don’t know. And in Haskell, evaluation order isn’t even always well-defined, so it’s entirely possible that the order in which the two effects are executed is actually completely impossible to determine!
This problem doesn’t arise in strict languages with a well-defined evaluation order, but in a lazy language like Haskell, we need some additional structure to ensure side-effects are (a) actually evaluated and (b) executed in the correct order. Monads happen to be an interface that elegantly provide the necessary structure to enforce that order.
Why is that? And how is that even possible? Well, the monadic interface provides a notion of data dependency in the signature for >>=
, which enforces a well-defined evaluation order. Haskell’s implementation of IO
is “magic”, in the sense that it’s implemented in the runtime, but the choice of the monadic interface is far from arbitrary. It seems to be a fairly good way to encode the notion of sequential actions in a pure language, and it makes it possible for Haskell to be lazy and referentially transparent without sacrificing predictable sequencing of effects.
It’s worth noting that monads are not the only way to encode side-effects in a pure way—in fact, historically, they’re not even the only way Haskell handled side-effects. Don’t be misled into thinking that monads are only for I/O (they’re not), only useful in a lazy language (they’re plenty useful to maintain purity even in a strict language), only useful in a pure language (many things are useful monads that aren’t just for enforcing purity), or that you needs monads to do I/O (you don’t). They do seem to have worked out pretty well in Haskell for those purposes, though.
† Regarding this, Simon Peyton Jones once noted that “Laziness keeps you honest” with respect to purity.
Could you just FFI into libc.so instead to do IO and skip the IO Monad thing?
Taking from https://en.wikibooks.org/wiki/Haskell/FFI#Impure_C_Functions, if you declare an FFI function as pure (so, with no reference to IO), then
GHC sees no point in calculating twice the result of a pure function
which means the the result of the function call is effectively cached. For example, a program where a foreign impure pseudo-random number generator is declared to return a CUInt
{-# LANGUAGE ForeignFunctionInterface #-} import Foreign import Foreign.C.Types foreign import ccall unsafe "stdlib.h rand" c_rand :: CUInt main = putStrLn (show c_rand) >> putStrLn (show c_rand)
returns the same thing every call, at least on my compiler/system:
16807 16807
If we change the declaration to return a IO CUInt
{-# LANGUAGE ForeignFunctionInterface #-} import Foreign import Foreign.C.Types foreign import ccall unsafe "stdlib.h rand" c_rand :: IO CUInt main = c_rand >>= putStrLn . show >> c_rand >>= putStrLn . show
then this results in (probably) a different number returned each call, since the compiler knows it's impure:
16807 282475249
So you're back to having to use IO for the calls to the standard libraries anyway.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With