In Leibniz's system of metaphysics, monads are basic substances that make up the universe but lack spatial extension and hence are immaterial. Each monad is a unique, indestructible, dynamic, soullike entity whose properties are a function of its perceptions and appetites.
A monad is an algebraic structure in category theory, and in Haskell it is used to describe computations as sequences of steps, and to handle side effects such as state and IO. Monads are abstract, and they have many useful concrete instances. Monads provide a way to structure a program.
monads are used to address the more general problem of computations (involving state, input/output, backtracking, ...) returning values: they do not solve any input/output-problems directly but rather provide an elegant and flexible abstraction of many solutions to related problems.
So in simple words, a monad is a rule to pass from any type X to another type T(X) , and a rule to pass from two functions f:X->T(Y) and g:Y->T(Z) (that you would like to compose but can't) to a new function h:X->T(Z) . Which, however, is not the composition in strict mathematical sense.
First: The term monad is a bit vacuous if you are not a mathematician. An alternative term is computation builder which is a bit more descriptive of what they are actually useful for.
They are a pattern for chaining operations. It looks a bit like method chaining in object-oriented languages, but the mechanism is slightly different.
The pattern is mostly used in functional languages (especially Haskell which uses monads pervasively) but can be used in any language which support higher-order functions (that is, functions which can take other functions as arguments).
Arrays in JavaScript support the pattern, so let’s use that as the first example.
The gist of the pattern is we have a type (Array
in this case) which has a method which takes a function as argument. The operation supplied must return an instance of the same type (i.e. return an Array
).
First an example of method chaining which does not use the monad pattern:
[1,2,3].map(x => x + 1)
The result is [2,3,4]
. The code does not conform to the monad pattern, since the function we are supplying as an argument returns a number, not an Array. The same logic in monad form would be:
[1,2,3].flatMap(x => [x + 1])
Here we supply an operation which returns an Array
, so now it conforms to the pattern. The flatMap
method executes the provided function for every element in the array. It expects an array as result for each invocation (rather than single values), but merges the resulting set of arrays into a single array. So the end result is the same, the array [2,3,4]
.
(The function argument provided to a method like map
or flatMap
is often called a "callback" in JavaScript. I will call it the "operation" since it is more general.)
If we chain multiple operations (in the traditional way):
[1,2,3].map(a => a + 1).filter(b => b != 3)
Results in the array [2,4]
The same chaining in monad form:
[1,2,3].flatMap(a => [a + 1]).flatMap(b => b != 3 ? [b] : [])
Yields the same result, the array [2,4]
.
You will immediately notice that the monad form is quite a bit uglier than the non-monad! This just goes to show that monads are not necessarily “good”. They are a pattern which is sometimes beneficial and sometimes not.
Do note that the monad pattern can be combined in a different way:
[1,2,3].flatMap(a => [a + 1].flatMap(b => b != 3 ? [b] : []))
Here the binding is nested rather than chained, but the result is the same. This is an important property of monads as we will see later. It means two operations combined can be treated the same as a single operation.
The operation is allowed to return an array with different element types, for example transforming an array of numbers into an array of strings or something else; as long as it still an Array.
This can be described a bit more formally using Typescript notation. An array has the type Array<T>
, where T
is the type of the elements in the array. The method flatMap()
takes a function argument of the type T => Array<U>
and returns an Array<U>
.
Generalized, a monad is any type Foo<Bar>
which has a "bind" method which takes a function argument of type Bar => Foo<Baz>
and returns a Foo<Baz>
.
This answers what monads are. The rest of this answer will try to explain through examples why monads can be a useful pattern in a language like Haskell which has good support for them.
Haskell and Do-notation
To translate the map/filter example directly to Haskell, we replace flatMap
with the >>=
operator:
[1,2,3] >>= \a -> [a+1] >>= \b -> if b == 3 then [] else [b]
The >>=
operator is the bind function in Haskell. It does the same as flatMap
in JavaScript when the operand is a list, but it is overloaded with different meaning for other types.
But Haskell also has a dedicated syntax for monad expressions, the do
-block, which hides the bind operator altogether:
do a <- [1,2,3]
b <- [a+1]
if b == 3 then [] else [b]
This hides the "plumbing" and lets you focus on the actual operations applied at each step.
In a do
-block, each line is an operation. The constraint still holds that all operations in the block must return the same type. Since the first expression is a list, the other operations must also return a list.
The back-arrow <-
looks deceptively like an assignment, but note that this is the parameter passed in the bind. So, when the expression on the right side is a List of Integers, the variable on the left side will be a single Integer – but will be executed for each integer in the list.
Example: Safe navigation (the Maybe type)
Enough about lists, lets see how the monad pattern can be useful for other types.
Some functions may not always return a valid value. In Haskell this is represented by the Maybe
-type, which is an option that is either Just value
or Nothing
.
Chaining operations which always return a valid value is of course straightforward:
streetName = getStreetName (getAddress (getUser 17))
But what if any of the functions could return Nothing
? We need to check each result individually and only pass the value to the next function if it is not Nothing
:
case getUser 17 of
Nothing -> Nothing
Just user ->
case getAddress user of
Nothing -> Nothing
Just address ->
getStreetName address
Quite a lot of repetitive checks! Imagine if the chain was longer. Haskell solves this with the monad pattern for Maybe
:
do
user <- getUser 17
addr <- getAddress user
getStreetName addr
This do
-block invokes the bind-function for the Maybe
type (since the result of the first expression is a Maybe
). The bind-function only executes the following operation if the value is Just value
, otherwise it just passes the Nothing
along.
Here the monad-pattern is used to avoid repetitive code. This is similar to how some other languages use macros to simplify syntax, although macros achieve the same goal in a very different way.
Note that it is the combination of the monad pattern and the monad-friendly syntax in Haskell which result in the cleaner code. In a language like JavaScript without any special syntax support for monads, I doubt the monad pattern would be able to simplify the code in this case.
Mutable state
Haskell does not support mutable state. All variables are constants and all values immutable. But the State
type can be used to emulate programming with mutable state:
add2 :: State Integer Integer
add2 = do
-- add 1 to state
x <- get
put (x + 1)
-- increment in another way
modify (+1)
-- return state
get
evalState add2 7
=> 9
The add2
function builds a monad chain which is then evaluated with 7 as the initial state.
Obviously this is something which only makes sense in Haskell. Other languages support mutable state out of the box. Haskell is generally "opt-in" on language features - you enable mutable state when you need it, and the type system ensures the effect is explicit. IO is another example of this.
IO
The IO
type is used for chaining and executing “impure” functions.
Like any other practical language, Haskell has a bunch of built-in functions which interface with the outside world: putStrLine
, readLine
and so on. These functions are called “impure” because they either cause side effects or have non-deterministic results. Even something simple like getting the time is considered impure because the result is non-deterministic – calling it twice with the same arguments may return different values.
A pure function is deterministic – its result depends purely on the arguments passed and it has no side effects on the environment beside returning a value.
Haskell heavily encourages the use of pure functions – this is a major selling point of the language. Unfortunately for purists, you need some impure functions to do anything useful. The Haskell compromise is to cleanly separate pure and impure, and guarantee that there is no way that pure functions can execute impure functions, directly or indirect.
This is guaranteed by giving all impure functions the IO
type. The entry point in Haskell program is the main
function which have the IO
type, so we can execute impure functions at the top level.
But how does the language prevent pure functions from executing impure functions? This is due to the lazy nature of Haskell. A function is only executed if its output is consumed by some other function. But there is no way to consume an IO
value except to assign it to main
. So if a function wants to execute an impure function, it has to be connected to main
and have the IO
type.
Using monad chaining for IO operations also ensures that they are executed in a linear and predictable order, just like statements in an imperative language.
This brings us to the first program most people will write in Haskell:
main :: IO ()
main = do
putStrLn ”Hello World”
The do
keyword is superfluous when there is only a single operation and therefore nothing to bind, but I keep it anyway for consistency.
The ()
type means “void”. This special return type is only useful for IO functions called for their side effect.
A longer example:
main = do
putStrLn "What is your name?"
name <- getLine
putStrLn "hello" ++ name
This builds a chain of IO
operations, and since they are assigned to the main
function, they get executed.
Comparing IO
with Maybe
shows the versatility of the monad pattern. For Maybe
, the pattern is used to avoid repetitive code by moving conditional logic to the binding function. For IO
, the pattern is used to ensure that all operations of the IO
type are sequenced and that IO
operations cannot "leak" to pure functions.
Summing up
In my subjective opinion, the monad pattern is only really worthwhile in a language which has some built-in support for the pattern. Otherwise it just leads to overly convoluted code. But Haskell (and some other languages) have some built-in support which hides the tedious parts, and then the pattern can be used for a variety of useful things. Like:
Maybe
)IO
)Parser
)Explaining "what is a monad" is a bit like saying "what is a number?" We use numbers all the time. But imagine you met someone who didn't know anything about numbers. How the heck would you explain what numbers are? And how would you even begin to describe why that might be useful?
What is a monad? The short answer: It's a specific way of chaining operations together.
In essence, you're writing execution steps and linking them together with the "bind function". (In Haskell, it's named >>=
.) You can write the calls to the bind operator yourself, or you can use syntax sugar which makes the compiler insert those function calls for you. But either way, each step is separated by a call to this bind function.
So the bind function is like a semicolon; it separates the steps in a process. The bind function's job is to take the output from the previous step, and feed it into the next step.
That doesn't sound too hard, right? But there is more than one kind of monad. Why? How?
Well, the bind function can just take the result from one step, and feed it to the next step. But if that's "all" the monad does... that actually isn't very useful. And that's important to understand: Every useful monad does something else in addition to just being a monad. Every useful monad has a "special power", which makes it unique.
(A monad that does nothing special is called the "identity monad". Rather like the identity function, this sounds like an utterly pointless thing, yet turns out not to be... But that's another story™.)
Basically, each monad has its own implementation of the bind function. And you can write a bind function such that it does hoopy things between execution steps. For example:
If each step returns a success/failure indicator, you can have bind execute the next step only if the previous one succeeded. In this way, a failing step aborts the whole sequence "automatically", without any conditional testing from you. (The Failure Monad.)
Extending this idea, you can implement "exceptions". (The Error Monad or Exception Monad.) Because you're defining them yourself rather than it being a language feature, you can define how they work. (E.g., maybe you want to ignore the first two exceptions and only abort when a third exception is thrown.)
You can make each step return multiple results, and have the bind function loop over them, feeding each one into the next step for you. In this way, you don't have to keep writing loops all over the place when dealing with multiple results. The bind function "automatically" does all that for you. (The List Monad.)
As well as passing a "result" from one step to another, you can have the bind function pass extra data around as well. This data now doesn't show up in your source code, but you can still access it from anywhere, without having to manually pass it to every function. (The Reader Monad.)
You can make it so that the "extra data" can be replaced. This allows you to simulate destructive updates, without actually doing destructive updates. (The State Monad and its cousin the Writer Monad.)
Because you're only simulating destructive updates, you can trivially do things that would be impossible with real destructive updates. For example, you can undo the last update, or revert to an older version.
You can make a monad where calculations can be paused, so you can pause your program, go in and tinker with internal state data, and then resume it.
You can implement "continuations" as a monad. This allows you to break people's minds!
All of this and more is possible with monads. Of course, all of this is also perfectly possible without monads too. It's just drastically easier using monads.
Actually, contrary to common understanding of Monads, they have nothing to do with state. Monads are simply a way to wrapping things and provide methods to do operations on the wrapped stuff without unwrapping it.
For example, you can create a type to wrap another one, in Haskell:
data Wrapped a = Wrap a
To wrap stuff we define
return :: a -> Wrapped a
return x = Wrap x
To perform operations without unwrapping, say you have a function f :: a -> b
, then you can do this to lift that function to act on wrapped values:
fmap :: (a -> b) -> (Wrapped a -> Wrapped b)
fmap f (Wrap x) = Wrap (f x)
That's about all there is to understand. However, it turns out that there is a more general function to do this lifting, which is bind
:
bind :: (a -> Wrapped b) -> (Wrapped a -> Wrapped b)
bind f (Wrap x) = f x
bind
can do a bit more than fmap
, but not vice versa. Actually, fmap
can be defined only in terms of bind
and return
. So, when defining a monad.. you give its type (here it was Wrapped a
) and then say how its return
and bind
operations work.
The cool thing is that this turns out to be such a general pattern that it pops up all over the place, encapsulating state in a pure way is only one of them.
For a good article on how monads can be used to introduce functional dependencies and thus control order of evaluation, like it is used in Haskell's IO monad, check out IO Inside.
As for understanding monads, don't worry too much about it. Read about them what you find interesting and don't worry if you don't understand right away. Then just diving in a language like Haskell is the way to go. Monads are one of these things where understanding trickles into your brain by practice, one day you just suddenly realize you understand them.
But, You could have invented Monads!
sigfpe says:
But all of these introduce monads as something esoteric in need of explanation. But what I want to argue is that they aren't esoteric at all. In fact, faced with various problems in functional programming you would have been led, inexorably, to certain solutions, all of which are examples of monads. In fact, I hope to get you to invent them now if you haven't already. It's then a small step to notice that all of these solutions are in fact the same solution in disguise. And after reading this, you might be in a better position to understand other documents on monads because you'll recognise everything you see as something you've already invented.
Many of the problems that monads try to solve are related to the issue of side effects. So we'll start with them. (Note that monads let you do more than handle side-effects, in particular many types of container object can be viewed as monads. Some of the introductions to monads find it hard to reconcile these two different uses of monads and concentrate on just one or the other.)
In an imperative programming language such as C++, functions behave nothing like the functions of mathematics. For example, suppose we have a C++ function that takes a single floating point argument and returns a floating point result. Superficially it might seem a little like a mathematical function mapping reals to reals, but a C++ function can do more than just return a number that depends on its arguments. It can read and write the values of global variables as well as writing output to the screen and receiving input from the user. In a pure functional language, however, a function can only read what is supplied to it in its arguments and the only way it can have an effect on the world is through the values it returns.
A monad is a datatype that has two operations: >>=
(aka bind
) and return
(aka unit
). return
takes an arbitrary value and creates an instance of the monad with it. >>=
takes an instance of the monad and maps a function over it. (You can see already that a monad is a strange kind of datatype, since in most programming languages you couldn't write a function that takes an arbitrary value and creates a type from it. Monads use a kind of parametric polymorphism.)
In Haskell notation, the monad interface is written
class Monad m where
return :: a -> m a
(>>=) :: forall a b . m a -> (a -> m b) -> m b
These operations are supposed to obey certain "laws", but that's not terrifically important: the "laws" just codify the way sensible implementations of the operations ought to behave (basically, that >>=
and return
ought to agree about how values get transformed into monad instances and that >>=
is associative).
Monads are not just about state and I/O: they abstract a common pattern of computation that includes working with state, I/O, exceptions, and non-determinism. Probably the simplest monads to understand are lists and option types:
instance Monad [ ] where
[] >>= k = []
(x:xs) >>= k = k x ++ (xs >>= k)
return x = [x]
instance Monad Maybe where
Just x >>= k = k x
Nothing >>= k = Nothing
return x = Just x
where []
and :
are the list constructors, ++
is the concatenation operator, and Just
and Nothing
are the Maybe
constructors. Both of these monads encapsulate common and useful patterns of computation on their respective data types (note that neither has anything to do with side effects or I/O).
You really have to play around writing some non-trivial Haskell code to appreciate what monads are about and why they are useful.
You should first understand what a functor is. Before that, understand higher-order functions.
A higher-order function is simply a function that takes a function as an argument.
A functor is any type construction T
for which there exists a higher-order function, call it map
, that transforms a function of type a -> b
(given any two types a
and b
) into a function T a -> T b
. This map
function must also obey the laws of identity and composition such that the following expressions return true for all p
and q
(Haskell notation):
map id = id
map (p . q) = map p . map q
For example, a type constructor called List
is a functor if it comes equipped with a function of type (a -> b) -> List a -> List b
which obeys the laws above. The only practical implementation is obvious. The resulting List a -> List b
function iterates over the given list, calling the (a -> b)
function for each element, and returns the list of the results.
A monad is essentially just a functor T
with two extra methods, join
, of type T (T a) -> T a
, and unit
(sometimes called return
, fork
, or pure
) of type a -> T a
. For lists in Haskell:
join :: [[a]] -> [a]
pure :: a -> [a]
Why is that useful? Because you could, for example, map
over a list with a function that returns a list. Join
takes the resulting list of lists and concatenates them. List
is a monad because this is possible.
You can write a function that does map
, then join
. This function is called bind
, or flatMap
, or (>>=)
, or (=<<)
. This is normally how a monad instance is given in Haskell.
A monad has to satisfy certain laws, namely that join
must be associative. This means that if you have a value x
of type [[[a]]]
then join (join x)
should equal join (map join x)
. And pure
must be an identity for join
such that join (pure x) == x
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With