Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fastest way to generate a billion random doubles in Haskell

I'm doing Monte-Carlo simulations, and currently using System.Random.

import System.Random
main = do
  g <- newStdGen
  let xs = randoms g :: [Double]
  -- normally, I'd do other magic here
  putStrLn $ show $ length $ take 10^9 xs

Unfortunately, this takes a really long time, at least 5x slower than Python's random.random(), to say nothing of the C rand() call.

With ghc -O2 -optc-ffast-math -optc-O3

import System.Random
main = do
  g <- newStdGen
  let xs = randoms h :: [Double]
  putStrLn $ show $ length $ take (10^7) xs

takes ~8s vs. (in iPython)

import random
%timeit len([random.random() for _ in range(10 ** 7)])

takes ~1.3s. My goal is one billion, but Haskell cannot generate them in a reasonable amount of time.

I also have a C++ program that generates floats with rand(). It does 10^7 samples in 0.2s.

How can I generate random doubles in the range [0-1) quickly in Haskell?

Ideally, the program GHC generates will just blast rand() POSIX calls and collect into a list. The answer with the cleanest & fastest code wins. (No, having 10x the code for 1% speedup isn't worth it.)

like image 212
PythonNut Avatar asked Nov 01 '22 16:11

PythonNut


1 Answers

Here's Mersenne which surprisingly seemed to be faster than MWC and beats C++ although we are on different computers ;-). It's tempting to see how much parallelising it would buy but I had better go back to work.

{-# LANGUAGE BangPatterns #-}
{-# OPTIONS_GHC -Wall                      #-}
{-# OPTIONS_GHC -fno-warn-name-shadowing   #-}
{-# OPTIONS_GHC -fno-warn-type-defaults    #-}

import System.Random.Mersenne.Pure64

testUniform :: Int -> Double -> PureMT -> Double
testUniform 0 !x _ = x
testUniform n !x gen =
    testUniform (n - 1) (x + y) gen'
  where
    (y, gen') = randomDouble gen

n :: Int
n = 10^7

total :: Double
total = testUniform n 0 (pureMT $ fromIntegral arbSeed)

arbSeed :: Int
arbSeed = 8

mean :: Double
mean = total / fromIntegral n

main :: IO ()
main = print mean

~/Dropbox/Private/Stochastic $ ./MersennePure +RTS -s
0.4999607889729769
     802,924,992 bytes allocated in the heap
         164,240 bytes copied during GC
          44,312 bytes maximum residency (2 sample(s))
          21,224 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      1634 colls,     0 par    0.00s    0.01s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0002s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    0.11s  (  0.11s elapsed)
  GC      time    0.00s  (  0.01s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    0.12s  (  0.12s elapsed)

  %GC     time       4.2%  (5.4% elapsed)

  Alloc rate    7,336,065,126 bytes per MUT second

  Productivity  95.7% of total user, 93.5% of total elapsed
like image 97
idontgetoutmuch Avatar answered Nov 15 '22 08:11

idontgetoutmuch