Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Help me understand this Scala code: scalaz IO Monad and implicits

This is a followup to this question.

Here's the code I'm trying to understand (it's from http://apocalisp.wordpress.com/2010/10/17/scalaz-tutorial-enumeration-based-io-with-iteratees/):

object io {
  sealed trait IO[A] {
    def unsafePerformIO: A
  }

  object IO {
    def apply[A](a: => A): IO[A] = new IO[A] {
      def unsafePerformIO = a
    }
  }

  implicit val IOMonad = new Monad[IO] {
    def pure[A](a: => A): IO[A] = IO(a)
    def bind[A,B](a: IO[A], f: A => IO[B]): IO[B] = IO {
      implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
                                        (x:A) => () => f(x).unsafePerformIO)()
    }
  }
}

This code is used like this (I'm assuming an import io._ is implied)

def bufferFile(f: File) = IO {   new BufferedReader(new FileReader(f)) }

def closeReader(r: Reader) = IO {   r.close }

def bracket[A,B,C](init: IO[A], fin: A => IO[B], body: A => IO[C]): IO[C] = for { a <- init
      c <- body(a)
      _ <- fin(a) }   yield c

def enumFile[A](f: File, i: IterV[String, A]): IO[IterV[String, A]] =  bracket(bufferFile(f),
          closeReader(_:BufferedReader),
          enumReader(_:BufferedReader, i))

I'm now trying to understand the implicit val IOMonad definition. Here's how I understand it. This is a scalaz.Monad, so it needs to define pure and bind abstract values of the scalaz.Monad trait.

pure takes a value and turns it into a value contained in the "container" type. For example it could take an Int and return a List[Int]. This seems pretty simple.

bind takes a "container" type and a function that maps the type that the container holds to another type. The value that is returned is the same container type, but it's now holding a new type. An example would be taking a List[Int] and mapping it to a List[String] using a function that maps Ints to Strings. Is bind pretty much the same as map?

The implementation of bind is where I'm stuck. Here's the code:

def bind[A,B](a: IO[A], f: A => IO[B]): IO[B] = IO {
  implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
      (x:A) => () => f(x).unsafePerformIO)()
}

This definition takes IO[A] and maps it to IO[B] using a function that takes an A and returns an IO[B]. I guess to do this, it has to use flatMap to "flatten" the result (correct?).

The = IO { ... } is the same as

 = new IO[A] {
  def unsafePerformIO = implicitly[Monad[Function0]].bind(() => a.unsafePerformIO,
      (x:A) => () => f(x).unsafePerformIO)()
  }
}

I think?

the implicitly method looks for an implicit value (value, right?) that implements Monad[Function0]. Where does this implicit definition come from? I'm guessing this is from the implicit val IOMonad = new Monad[IO] {...} definition, but we're inside that definition right now and things get a little circular and my brain starts to get stuck in an infinite loop :)

Also, the first argument to bind (() => a.unsafePerformIO) seems to be a function that takes no parameters and returns a.unsafePerformIO. How should I read this? bind takes a container type as its first argument, so maybe () => a.unsafePerformIO resolves to a container type?

like image 666
three-cups Avatar asked Sep 14 '11 15:09

three-cups


1 Answers

IO[A] is intended to represent an Action returning an A, where the result of the Action may depend on the environment (meaning anything, values of variables, file system, system time...) and the execution of the action may also modify the environment. Actually, scala type for an Action would be Function0. Function0[A] returns an A when called and it is certainly allowed to depend on and modify the environment. IO is Function0 under another name, but it is intended to distinguish (tag?) those Function0 which depends on the environment from the other ones, which are actually pure value (if you say f is a function[A] which always returns the same value, without any side effect, there is no much difference between f and its result). To be precise, it is not so much that function tagged as IO must have side effect. It is that those not so tagged must have none. Note however than wrapping impure functions in IO is entirely voluntary, there is no way you will have a guarantee when you get a Function0 that it is pure. Using IO is certainly not the dominant style in scala.

pure takes a value and turns it into a value contained in the "container" type.

Quite right, but "container" may mean quite a lot of things. And the one returned by pure must be as light as possible, it must be the one that makes no difference. The point of list is that they may have any number of values. The one returned by pure must have one. The point of IO is that it depends on and affect the environment. The one returned by pure must do no such thing. So it is actually the pure Function0 () => a, wrapped in IO.

bind pretty much the same as map

Not so, bind is the same as flatMap. As you write, map would receive a function from Int to String, but here you have the function from Int to List[String]

Now, forget IO for a moment and consider what bind/flatMap would mean for an Action, that is for Function0. Let's have

val askUserForLineNumber: () => Int = {...}
val readingLineAt: Int => Function0[String] = {i: Int  => () => ...}

Now if we must combine, as bind/flatMap does, those items to get an action that returns a String, what it must be is pretty clear: ask the reader for the line number, read that line and returns it. That would be

val askForLineNumberAndReadIt= () => {
  val lineNumber : Int = askUserForLineNumber()
  val readingRequiredLine: Function0[String] = readingLineAt(line)
  val lineContent= readingRequiredLine()
  lineContent
}

More generically

def bind[A,B](a: Function0[A], f: A => Function0[B]) = () => {
  val value = a()
  val nextAction = f(value)
  val result = nextAction()
  result
}

And shorter:

def bind[A,B](a: Function0[A], f: A => Function0[B]) 
  = () => {f(a())()}

So we know what bind must be for Function0, pure is clear too. We can do

object ActionMonad extends Monad[Function0] {
  def pure[A](a: => A) = () => a
  def bind[A,B](a: () => A, f: A => Function0[B]) = () => f(a())()
}

Now, IO is Function0 in disguise. Instead of just doing a(), we must do a.unsafePerformIO. And to define one, instead of () => body, we write IO {body} So there could be

object IOMonad extends Monad[IO] {
  def pure[A](a: => A) = IO {a}
  def bind[A,B](a: IO[A], f: A => IO[B]) = IO {f(a.unsafePerformIO).unsafePerformIO}
}

In my view, that would be good enough. But in fact it repeats the ActionMonad. The point in the code you refer to is to avoid that and reuse what is done for Function0 instead. One goes easily from IO to Function0 (with () => io.unsafePerformIo) as well as from Function0 to IO (with IO { action() }). If you have f: A => IO[B], you can also change that to f: A => Function0[B], just by composing with the IO to Function0 transform, so (x: A) => f(x).unsafePerformIO.

What happens here in the bind of IO is:

  1. () => a.unsafePerformIO: turn a into a Function0
  2. (x:A) => () => f(x).unsafePerformIO): turn f into an A => Function0[B]
  3. implicitly[Monad[Function0]]: get the default monad for Function0, the very same as the ActionMonad above
  4. bind(...): apply the bind of the Function0 monad to the arguments a and f that have just been converted to Function0
  5. The enclosing IO{...}: convert the result back to IO.

(Not sure I like it much)

like image 165
Didier Dupont Avatar answered Sep 30 '22 13:09

Didier Dupont