
What is the proper monad or sequence comprehension to both map and carry state across?

I'm writing a programming language interpreter.

I need the right code idiom to both evaluate a sequence of expressions, producing a sequence of their values, and propagate state from one evaluation to the next as the evaluations take place. I'd like a functional programming idiom for this.

It's not a fold because the results come out like a map. It's not a map because of the state prop across.

What I have is this code which I'm using to try to figure this out. Bear with a few lines of test rig first:

// test rig
class MonadLearning extends JUnit3Suite {

  val d = List("1", "2", "3") // some expressions to evaluate. 

  type ResType = Int 
  case class State(i : ResType) // trivial state for experiment purposes
  val initialState = State(0)

// my stub/dummy "eval" function...obviously the real one will be...real.
  def computeResultAndNewState(s : String, st : State) : (ResType, State) = {
    val State(i) = st
    val res = s.toInt + i
    val newStateInt = i + 1
    (res, State(newStateInt))
  }

My current solution uses a var that is updated as the body of the map is evaluated:

  def testTheVarWay() {
    var state = initialState
    val r = d.map {
      s =>
        {
          val (result, newState) = computeResultAndNewState(s, state)
          state = newState
          result
        }
    }
    println(r)
    println(state)
  }

I also have what I consider an unacceptable solution using foldLeft, which follows what I call the "bag it as you fold" idiom:

  def testTheFoldWay() {
    // This startFold thing, requires explicit type. That alone makes it muddy.
    val startFold : (List[ResType], State) = (Nil, initialState)
    val (r, state) = d.foldLeft(startFold) {
      case ((tail, st), s) => {
        val (r, ns) = computeResultAndNewState(s, st)
        (tail :+ r, ns) // we want a constant-time append here, not O(N). Or could Cons on front and reverse later
      }
    }

    println(r)
    println(state)
  }
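The comment in the fold mentions consing onto the front and reversing later. A minimal sketch of that variant follows, assuming the same stub evaluator as the test rig; the object name FoldWithReverse and the method evalAll are made up for illustration. Seeding the fold with List.empty[ResType] also lets the accumulator type be inferred, so no separately annotated start value is needed.

```scala
object FoldWithReverse {
  type ResType = Int
  case class State(i: ResType)

  // Same stub evaluator as in the test rig: result depends on the state,
  // and the state counter is bumped by one.
  def computeResultAndNewState(s: String, st: State): (ResType, State) =
    (s.toInt + st.i, State(st.i + 1))

  // Hypothetical helper: prepend each result (constant time per element),
  // then reverse the accumulated list once at the end (linear).
  def evalAll(exprs: List[String], start: State): (List[ResType], State) = {
    val (reversed, finalState) =
      exprs.foldLeft((List.empty[ResType], start)) {
        case ((acc, st), s) =>
          val (r, ns) = computeResultAndNewState(s, st)
          (r :: acc, ns)
      }
    (reversed.reverse, finalState)
  }
}
```

This is still the "bag it as you fold" shape, only without the quadratic append.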

I also have a couple of recursive variations (which are obvious, but also not clear or well motivated), one using streams which is almost tolerable:

def testTheStreamsWay() {
  lazy val states = initialState #:: resultStates // there are states
  lazy val args = d.toStream // there are arguments
  lazy val argPairs = args zip states // put them together
  lazy val resPairs : Stream[(ResType, State)] = argPairs.map{ case (d1, s1) => computeResultAndNewState(d1, s1) } // map across them
  lazy val (results , resultStates) = myUnzip(resPairs)// Note .unzip causes infinite loop. Had to write my own.

  lazy val r = results.toList
  lazy val finalState = resultStates.last

  println(r)
  println(finalState)
}

But, I can't figure out anything as compact or clear as the original 'var' solution above, which I'm willing to live with, but I think somebody who eats/drinks/sleeps monad idioms is going to just say ... use this... (Hopefully!)

Mike Beckerle asked Sep 04 '12 10:09




2 Answers

With the map-with-accumulator combinator (the easy way)

The higher-order function you want is mapAccumL. It's in Haskell's standard library, but for Scala you'll have to use something like Scalaz.

First the imports (note that I'm using Scalaz 7 here; for previous versions you'd import Scalaz._):

import scalaz._, syntax.std.list._

And then it's a one-liner:

scala> d.mapAccumLeft(initialState, computeResultAndNewState)
res1: (State, List[ResType]) = (State(3),List(1, 3, 5))

Note that I've had to reverse the order of your evaluator's arguments and the return value tuple to match the signatures expected by mapAccumLeft (state first in both cases).
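For readers without Scalaz on the classpath, here is a rough standard-library sketch of what a map-with-accumulator does. It is not the Scalaz implementation; the object MapAccum, its mapAccumLeft method, and the step function are illustrative names, and step is the evaluator rewritten with the state first in both the arguments and the result, as described above.

```scala
object MapAccum {
  // Hypothetical stand-alone helper: thread an accumulator c through xs
  // from left to right, collecting one output per element.
  def mapAccumLeft[A, B, C](xs: List[A], c: C)(f: (C, A) => (C, B)): (C, List[B]) = {
    val (finalC, reversed) = xs.foldLeft((c, List.empty[B])) {
      case ((acc, out), a) =>
        val (next, b) = f(acc, a)
        (next, b :: out) // prepend now, reverse once at the end
    }
    (finalC, reversed.reverse)
  }

  case class State(i: Int)

  // The question's stub evaluator with state first in arguments and result.
  def step(st: State, s: String): (State, Int) =
    (State(st.i + 1), s.toInt + st.i)
}
```

Calling MapAccum.mapAccumLeft(List("1", "2", "3"), MapAccum.State(0))((st, s) => MapAccum.step(st, s)) should give the same pair as the Scalaz call: the final state first, then the list of results.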

With the state monad (the slightly less easy way)

As Petr Pudlák points out in another answer, you can also use the state monad to solve this problem. Scalaz actually provides a number of facilities that make working with the state monad much easier than the version in his answer suggests, and they won't fit in a comment, so I'm adding them here.

First of all, Scalaz does provide a mapM—it's just called traverse (which is a little more general, as Petr Pudlák notes in his comment). So assuming we've got the following (I'm using Scalaz 7 again here):

import scalaz._, Scalaz._

type ResType = Int
case class Container(i: ResType)

val initial = Container(0)
val d = List("1", "2", "3")

def compute(s: String): State[Container, ResType] = State {
  case Container(i) => (Container(i + 1), s.toInt + i)
}

We can write this:

d.traverse[({type L[X] = State[Container, X]})#L, ResType](compute).run(initial)

If you don't like the ugly type lambda, you can get rid of it like this:

type ContainerState[X] = State[Container, X]

d.traverse[ContainerState, ResType](compute).run(initial)

But it gets even better! Scalaz 7 gives you a version of traverse that's specialized for the state monad:

scala> d.traverseS(compute).run(initial)
res2: (Container, List[ResType]) = (Container(3),List(1, 3, 5))

And as if that wasn't enough, there's even a version with the run built in:

scala> d.runTraverseS(initial)(compute)
res3: (Container, List[ResType]) = (Container(3),List(1, 3, 5))

Still not as nice as the mapAccumLeft version, in my opinion, but pretty clean.

Travis Brown answered Oct 21 '22 10:10


What you're describing is a computation within the state monad. I believe that the answer to your question

It's not a fold because the results come out like a map. It's not a map because of the state prop across.

is that it's a monadic map using the state monad.

Values of the state monad are computations that read some internal state, possibly modify it, and return some value. It is often used in Haskell (see here or here).

For Scala, there is a trait in the ScalaZ library called State that models it (see also the source). There are utility methods in States for creating instances of State. Note that from the monadic point of view State is just a monadic value. This may seem confusing at first, because it's described by a function depending on a state. (A monadic function would be something of type A => State[B].)

Next, what you need is a monadic map function that computes the values of your expressions, threading the state through the computations. In Haskell, there is a library function mapM that does just that when specialized to the state monad.

In Scala, there is no such library function (if there is, please correct me). But it's possible to create one. To give a full example:

import scalaz._;

object StateExample
  extends App
  with States /* utility methods */
{
  // The context that is threaded through the state.
  // In our case, it just maps variables to integer values.
  class Context(val map: Map[String,Int]);

  // An example that returns the requested variable's value
  // and increases its value in the context.
  def eval(expression: String): State[Context,Int] =
    state((ctx: Context) => {
      val v = ctx.map.get(expression).getOrElse(0);
      (new Context(ctx.map + ((expression, v + 1)) ), v);
    });

  // Specialization of Haskell's mapM to our State monad.
  def mapState[S,A,B](f: A => State[S,B])(xs: Seq[A]): State[S,Seq[B]] =
    state((initState: S) => {
      var s = initState;
      // process the sequence, threading the state
      // through the computation
      val ys = for(x <- xs) yield { val r = f(x)(s); s = r._1; r._2 };
      // return the final state and the output result
      (s, ys);
    });


  // Example: Try to evaluate some variables, starting from an empty context.
  val expressions = Seq("x", "y", "y", "x", "z", "x");

  print( mapState(eval)(expressions) ! new Context(Map[String,Int]()) );
}

This way you can create simple functions that take some arguments and return State and then combine them into more complex ones by using State.map or State.flatMap (or perhaps better using for comprehensions), and then you can run the whole computation on a list of expressions by mapM.
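To make that concrete without depending on any particular Scalaz version, here is a small self-contained sketch. It hand-rolls a minimal State type rather than using the library's, so every name in it (MiniState, pure, mapM, evalTwice) is an assumption made for illustration. It shows a for comprehension combining two small state computations, and a mapM written without a var.

```scala
object MiniState {
  // A state computation is a function from a state to (new state, value).
  final case class State[S, A](run: S => (S, A)) {
    def map[B](f: A => B): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); (s1, f(a)) }
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State { (s: S) => val (s1, a) = run(s); f(a).run(s1) }
  }

  // Return a value without touching the state.
  def pure[S, A](a: A): State[S, A] = State((s: S) => (s, a))

  // Monadic map: apply f to every element, threading the state left to right.
  def mapM[S, A, B](xs: List[A])(f: A => State[S, B]): State[S, List[B]] =
    xs.foldRight(pure[S, List[B]](Nil)) { (x, rest) =>
      for { b <- f(x); bs <- rest } yield b :: bs
    }

  type Context = Map[String, Int]

  // Return the variable's current value (0 if unset) and increment it.
  def eval(name: String): State[Context, Int] =
    State { (ctx: Context) =>
      val v = ctx.getOrElse(name, 0)
      (ctx.updated(name, v + 1), v)
    }

  // Two small computations combined with a for comprehension.
  def evalTwice(name: String): State[Context, Int] =
    for { a <- eval(name); b <- eval(name) } yield a + b
}
```

Running MiniState.mapM(List("x", "y", "y", "x", "z", "x"))(s => MiniState.eval(s)).run(Map.empty) evaluates the same expressions as the example above, starting from an empty context. Note that this foldRight-based mapM is not stack-safe for very long lists; it is meant only to show the shape of the idea.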


See also http://blog.tmorris.net/posts/the-state-monad-for-scala-users/


Edit: See Travis Brown's answer, which describes how to use the state monad in Scala much more nicely.

He also asks:

But why, when there's a standard combinator that does exactly what you need in this case? (I ask this as someone who's been slapped for using the state monad when mapAccumL would do.)

It's because the original question asked:

It's not a fold because the results come out like a map. It's not a map because of the state prop across.

and I believe the proper answer is it is a monadic map using the state monad.

Using mapAccumL is surely faster, with less memory and CPU overhead. But the state monad captures the concept of what is going on, the essence of the problem. I believe that in many (if not most) cases this is more important. Once we recognize the essence of the problem, we can either use high-level concepts to describe the solution nicely (perhaps sacrificing a little speed or memory) or optimize it to be fast (or perhaps even manage to do both).

On the other hand, mapAccumL solves this particular problem but doesn't give us a broader answer. If we need to modify the problem a little, it may no longer work. Or, as the library grows more complex, the code can become messy, and we won't know how to improve it or how to make the original idea clear again.

For example, in the case of evaluating stateful expressions, the library can become complicated. But if we use the state monad, we can build the library around small functions, each taking some arguments and returning something like State[Context,Result]. These atomic computations can be combined into more complex ones using the flatMap method or for comprehensions, and finally we'll construct the desired task. The principle will stay the same across the whole library, and the final task will also be something that returns State[Context,Result].

To conclude: I'm not saying that using the state monad is the best solution, and it's certainly not the fastest one. I just believe it is the most didactic for a functional programmer: it describes the problem in a clean, abstract way.

Petr answered Oct 21 '22 09:10