Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Haskell `randoms` function not behaving well with my library

Tags:

random

haskell

I'm trying to write a Haskell library for cryptographically secure random numbers. The code follows:

module URandom (URandom, initialize) where

import qualified Data.ByteString.Lazy as B
import System.Random
import Data.Word

newtype URandom = URandom [Word8]

instance RandomGen URandom where
  next (URandom (x : xs)) = (fromIntegral x, URandom xs)
  split (URandom l) = (URandom (evens l), URandom (odds l))
    where evens (x : _ : xs) = x : evens xs
          odds (_ : x : xs) = x : odds xs
  genRange _ = (fromIntegral (minBound :: Word8), fromIntegral (maxBound :: Word8))

initialize :: IO URandom
initialize = URandom . B.unpack <$> B.readFile "/dev/urandom"

Unfortunately, it's not behaving like I want. In particular, performing

take 10 . randoms <$> initialize

yields (something similar to)

[-4611651379516519433,-4611644973572935887,-31514321567846,9223361179177989878,-4611732094835278236,9223327886739677537,4611709625714976418,37194416358963,4611669560113361421,-4611645373004878170,-9223329383535098640,4611675323959360258,-27021785867556,9223330964083681227,4611705212636167666]

which to my, albiet untrained, eye, does not appear very random. A lot of 46... and 92... in there.

What could be going wrong? Why doesn't this produce well-distributed numbers? It's worth noting that even if I concatenate together Word8s to form Ints the distribution does not improve, I didn't think it was worth including that code here.

Edit: here's some evidence that's not distributed correctly. I've written a function called histogram:

histogram :: ∀ t . (Integral t, Bounded t)
          => [t] -> Int -> S.Seq Int
histogram [] buckets = S.replicate buckets 0
histogram (x : xs) buckets = S.adjust (+ 1) (whichBucket x) (histogram xs buckets)
  where whichBucket x = fromIntegral $ ((fromIntegral x * fromIntegral buckets) :: Integer) `div` fromIntegral (maxBound :: t)

and when I run

g <- initialize
histogram (take 1000000 $ randoms g :: [Word64]) 16

I get back

fromList [128510,0,0,121294,129020,0,0,122090,127873,0,0,120919,128637,0,0,121657]

Some of the buckets are completely empty!

like image 225
Program man Avatar asked May 25 '17 18:05

Program man


1 Answers

The issue is a bug in random-1.0.1.1 that was fixed in random-1.1. The changelog points to this ticket. In particular, referring to the older version:

It also assumes that all RandomGen implementations produce the same range of random values as StdGen.

Here randomness is produced 8 bits at a time, and that caused the observed behavior.

random-1.1 fixed this:

This implementation also works with any RandomGen, even ones that produce as little as a single bit of entropy per next call or have a minimum bound other than zero.

like image 99
Li-yao Xia Avatar answered Nov 20 '22 21:11

Li-yao Xia