Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why `Random.nextInt` is considered not ' composable, modular, easily parallelized'

From the book "functional programming in Scala", there is an section about how to write the random number generator in functional way.

It gives an example about scala.util.Random:

Even if we didn’t know anything about what happens inside scala.util.Random, we can assume that the object rng has some internal state that gets updated after each invocation, since we’d otherwise get the same value each time we called nextInt or nextDouble. Because the state updates are performed as a side effect, these methods aren’t referentially transparent. And as we know, this implies that they aren’t as testable, composable, modular, and easily parallelized as they could be.

The last sentence, it says that is not:

  1. testable
  2. composable
  3. modular
  4. easily parallelized

In the later content, it explained the testable part, but how to understand the other 3 aspects?

like image 491
Freewind Avatar asked Jul 15 '15 13:07

Freewind


1 Answers

1. Testable

You cannot easily isolate its state, especially when tests are running in parallel. This makes it impossible to reproduce a sequence of random numbers in a predictable manner when running tests in parallel.

2. Composable

You cannot build a sequence of operations that first seed the random number with a known value and then get a sequence with regards of this seed (especially when running in parallel).

While this might overlap with "testable", it might also be necessary for security reasons, so you don't accidentally leak the state from one session to a different session.

3. Modular

There is no way to swap in different algorithms of randomization. All modules have to use the same algorithm. This overlaps with "composable" in the sense that instead of a seed you also cannot provide the generator as a (possibly implicit) parameter.

4. Parallelized

There is only one state for all threads. In a highly concurrent situation, this might be a serious bottleneck.

Some example code

Please don't use this code in production. It's quite clumsy and with scalaz or cats you can easily build something better. Typically a random number generator is a comonad, it's state is hold in a monad.

First let's make the code modular by providing different pseudorandom generators and different seeds

// This is a commonly used pseudo-random generator
// https://en.wikipedia.org/wiki/Linear_congruential_generator
// This implementation is awful slow!
def lcg(multiplier : BigInt, increment : BigInt, dividend : BigInt)(seed : BigInt) =
  (seed * multiplier + increment) mod dividend

// Some commonly used pseudo random number generators
def glibRand = lcg(multiplier = 1103515245,   increment = 12345,   dividend = 2^31) _
def vb6Rand  = lcg(multiplier = 214013,       increment = 2531011, dividend = 2^31) _
def javaRand = lcg(multiplier = 25214903917L, increment = 11,      dividend = 2L^48) _


// This is really insecure, don't use it for anything serious!
def seedFromTime() = BigInt(System.nanoTime())

// I used a dice to determine this value, so it's random!
def randomSeed = 5 

Then let's build an implementation that can compose functions:

case class Random[T,U](val generator : T => T, val seed : T)(val currentValue : U){
 def apply[V](f : (T,U) => V) : Random[T,V] = {
   val nextRandom = generator(seed)
   Random(generator, nextRandom)(f(nextRandom, currentValue))
  }

  def unsafeGetValue = currentValue
}

A few example where we use this building blocks to generate a list of random numbers and sum some random numbers:

val randomNumbers  = Stream.iterate(Random(glibRand, seedFromTime())(List.empty[BigInt])){r : Random[BigInt, List[BigInt]] =>
  r.apply{(rnd : BigInt, numbers : List[BigInt]) =>
    rnd :: numbers
  }
}

randomNumbers.take(10).last.unsafeGetValue

val randomSum = Stream.iterate(Random(glibRand, seedFromTime())(BigInt(0))) { r: Random[BigInt, BigInt] =>
  r.apply(_ + _)
}

randomSum.take(100).last.unsafeGetValue

As usual, when we want to compose something in a functional language, we will compose functions to reach different goals. If we would have provided a monadic implementation, we could also combine functions with side effects.

Parallelization and testability follow automatically in this case (as usual).

like image 182
stefan.schwetschke Avatar answered Nov 15 '22 22:11

stefan.schwetschke