I'm currently designing a numerical algorithm which as part of its operations requires updating a vector of doubles
many times. Due to the fact that the algorithm has to be as space and time efficient as possible, I do not want to code the traditional type of FP code which creates many versions of the data structure under the hood after each operation on it. Neither do I want to create mutable data structures and have them globally available. Consequently, I have decided to use a mutable data structure but then choose to do the the mutable operations in a State
monad. Since this is my first stab at using a State
monad, I want to confirm the whether I have or have not
The update
function transitions the data structure state. Since the destructive update is localised within this function and a handle to the data structure cannot be got at from outside, I think this function is pure and referentially transparent.
def update(i : Int,d : Double) = State[ArraySeq[Double], Unit]{
case xs: ArraySeq[Double] => {xs(i) = d; (xs, ())}
}
The app
function is a toy function which will consume a sequence of double
s and modify it's state:
def app : State[ArraySeq[Double], Unit] = for{
_ <- update(0, 3.142)
// do a heap of stuff on ArraySeq
}yield()
Call:
app(Vector(0.0, 1.0, 2.0, 3.0, 4.0).to[ArraySeq])._1.to[Vector]
Result:
res0: Vector[Double] = Vector(3.142, 1.0, 2.0, 3.0, 4.0)
Referential transparency means you can take an expression in your program, and replace it with the result of that expression. As an example, if you have the expression 5+4, you can replace that with 9. That's what it means. Now 5+4 is easy because you can do that in your head without running the program.
Some functional programming languages enforce referential transparency for all functions. The importance of referential transparency is that it allows the programmer and the compiler to reason about program behavior as a rewrite system.
Another benefit of referential transparency is that it eliminates side-effects from your code. Referential transparency requires that functions be free of any code that can modify the program state outside of the function.
All pure functions are necessarily referentially transparent. Since, by definition, they cannot access anything other than what they are passed, their result must be fully determined by their arguments. However, it is possible to have referentially transparent functions which are not pure.
I guess you could say that your update
itself is pure, in the sense that it only represents some mutation, but as soon as you run it all bets are off:
scala> val xs = List(1.0, 2.0, 3.0).to[ArraySeq]
xs: scala.collection.mutable.ArraySeq[Double] = ArraySeq(1.0, 2.0, 3.0)
scala> update(0, 10).eval(xs)
res0: scalaz.Id.Id[Unit] = ()
scala> xs
res1: scala.collection.mutable.ArraySeq[Double] = ArraySeq(10.0, 2.0, 3.0)
This is a bad scene, and it's the opposite of pure or referentially transparent.
State
isn't really buying you anything in your example—the fact that you're calling app
in such a way that you have an ArraySeq
that nobody else can mutate is. You might as well bite the bullet and work with a mutable data structure in the usual way in a scope that you control—i.e., write app
like this:
def app(xs: Vector[Double]): Vector[Double] = {
val arr = xs.to[ArraySeq]
// Perform all your updates in the usual way
arr.toVector
}
This actually is pure and referentially transparent, but it's also more honest than the State
version. If I see a value of type State[Foo, Unit]
, my assumption is going to be that this value represents some kind of operation that changes a Foo
into a new Foo
, without mutating the original Foo
. This is all the state monad is—it provides a nice way of modeling operations on immutable data structures and composing them in a way that looks kind of like mutation. If you mix it with actual mutation you're likely to confuse the hell out of anyone using your code.
If you really want real mutation and purity at the same time, you can look at Scalaz's STArray
. It's a very clever solution to this problem, and in languages like Haskell it's an approach that makes a lot of sense. My own feeling is that it's pretty much always the wrong solution in Scala, though. If you really need the performance of a mutable array, just use a local mutable array and make sure you don't leak it to the outside world. If you don't need that kind of performance (and most of the time you don't), use something like State
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With