I am working on understanding the State
monad and have written two simple versions of the famous fibonacci to memoize the function. The one with let
in the body runs very slowly. The one with <-
runs very fast. My question: Why? Is let causing full evaluation somehow while <-
is allowing the M.lookup
via Data.Map
to work?
-- using state monad and let -- very slow fibLet :: Int -> State (M.Map Int Integer) Integer fibLet n = do case n of 0 -> return 0 1 -> return 1 n -> do mp <- get if M.member n mp then return $ fromJust (M.lookup n mp) else do let s1 = evalState (fibLet (n - 1)) mp let s2 = evalState (fibLet (n - 2)) mp let s3 = s1+s2 modify $ M.insert n s3 return s3 fibonacciLet :: Int -> Integer fibonacciLet n = evalState (fibLet n) M.empty
-- using state monad and <- -- very fast fibArrow :: Int -> State (M.Map Int Integer) Integer fibArrow n = do case n of 0 -> return 0 1 -> return 1 n -> do mp <- get if M.member n mp then return $ fromJust (M.lookup n mp) else do s1 <-fibArrow (n - 1) s2 <-fibArrow (n - 2) let s3 = s1+s2 modify $ M.insert n s3 return s3 fibonacciArrow :: Int -> Integer fibonacciArrow n = evalState (fibArrow n) M.empty
fibLet
isn't actually using its state very much at all! That's why it's so slow; it's essentially a very complicated way to write the classic Haskell fibonacci definition:
fib 0 = 0 fib 1 = 1 fib n = fib (n - 1) + fib (n - 2)
Look carefully what's going on here:
-- from earlier: mp <- get do let s1 = evalState (fibLet (n - 1)) mp let s2 = evalState (fibLet (n - 2)) mp let s3 = s1+s2 modify $ M.insert n s3 return s3
mp
is the Map
of currently-known results. You use evalState
to run fibLet (n - 1)
, starting its state with mp
. Then you use evalState
again to run fibLet (n - 2)
, starting its state with mp
again. This means fibLet (n - 1)
and fibLet (n - 2)
aren't sharing any work; each of them is using the same starting map of already known results mp
, so anything that isn't already in that map is going to have to be computed by both branches.
However it's actually even worse than that. Look at the type of evalState
:
evalState :: State s a -> s -> a
It has no State
in its return type. That means it isn't actually stateful. It does not interact with a any surrounding state; effectively it starts off a new state thread, runs it to completion1, after which the state is discarded.
You can tell that really easily by looking at the type of evalState
slightly differently (but equivalently):
evalState :: State s a -> (s -> a)
evalState
turns a State s a
into a simple function typed s -> a
. Obviously a plain old function can't alter the implicit state of a do
block its called from (that's the whole point of making statefulness explicit in the types). So that means:
evalState (fibLet (n - 1)) :: M.Map Int Integer -> Integer
evalState (fibLet (n - 1))
is just a plain old function from maps to integers. It neither accesses nor affects state of any kind.
So that means that after let s1 = evalState (fibLet (n - 1)) mp
the state in the outer do
block is still exactly equal to mp
. Nothing has been saved in our state map of already computed results. So not only are we not sharing any work between the two separate recursive calls, they're not even sharing work within the deeper layers of recursion inside each call!
To prove it, try running fibLet
using runState
instead of evalState
, so you can see what the final map is:
ghci> runState (fibLet 20) M.empty (6765,fromList [(20,6765)]) it :: (Integer, M.Map Int Integer)
There's only a single entry in the map, added as the last step of the do
block in the outermost call to fibLet
, just before it returned.
If you do the same thing with fibArrow
you get this:
ghci> runState (fibArrow 20) M.empty (6765,fromList [(2,1),(3,2),(4,3),(5,5),(6,8),(7,13),(8,21),(9,34),(10,55),(11,89),(12,144),(13,233),(14,377),(15,610),(16,987),(17,1597),(18,2584),(19,4181),(20,6765)]) it :: (Integer, M.Map Int Integer)
You can clearly see it's memoized all the intermediate results, not just the final one.
Bottom line: You don't usually want to use evalState
(and similar functions like runState
and execState
) inside a function that's operating in the State
monad. Those functions are intended for running the whole stateful computation, and so are normally run from "outside" a state monad context. When you happen to run them "inside" a state monad context, they do not interact with it; they instead receive (and return, in the case of runState
and execState
) state through ordinary argument passing and function return, not through the implicit state threading provided by State
.
If you want to make a call to a "stateful sub-computation" (i.e. call a State _ _
inside a function returning State _ _
), then you need to bind the sub-computation as part of the outer one. The arrow statement inside do
blocks does this; the let
statement doesn't, it only gives a name to an ordinary expression. That's why you found yourself having to use evalState
to get a result and explicitly feed mp
even though you're already inside a stateful context where it should be available implicitly; the messiness of that was a hint that something wasn't right.
1 Well, it "runs it to completion" assuming that the result value is fully demanded.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With