I wanted to fool a bit around with random numbers, if the random generator in haskell is distributed uniformly or not, thus I wrote the following program after a few tries (with lists being generated leading to stack overflow).
module Main where
import System.Environment (getArgs)
import Control.Applicative ((<$>))
import System.Random (randomRIO)
main :: IO ()
main = do nn <- map read <$> getArgs :: IO [Int]
let n = if null nn then 10000 else head nn
m <- loop n 0 (randomRIO (0,1))
putStrLn $ "True: "++show (m//n::Double) ++", False "++show ((n-m)//n :: Double)
return ()
loop :: Int -> Int -> IO Double -> IO Int
loop n acc x | n<0 = return acc
| otherwise = do x' <- round <$> x
let x'' = (x' + acc) in x'' `seq` loop (n-1) x'' x
(//) :: (Integral a, Fractional b) => a -> a -> b
x // y = fromIntegral x / fromIntegral y
as I got it to work ok-ish I decided to write another version - in java (which I am not very good in), and expected haskell to beat it, but the java program ran about half the time compared to the haskell version
import java.util.Random;
public class MonteCarlo {
public static void main(String[] args) {
int n = Integer.parseInt(args[0]);
Random r = new Random();
int acc = 0;
for (int i=0; i<=n; i++) {
acc += Math.round(r.nextDouble());
}
System.out.println("True: "+(double) acc/n+", False: "+(double)(n-acc)/n);
}
}
I tried to look at the profile of the haskell version - which told me that most of the work was done in the loop - no wonder! I tried to look at the core, but I really don't know enough to understand that. I figure that the java version, may be using more than one core - as the system used more than 100%, when I timed it.
I guess one could improve the code using unboxed Doubles/Ints, but again my knowledge of hakell is not up to that.
This is way too big to leave as a comment, but it's not really an answer, just some data. I took the main loop and tested it with criterion
. Then a few variations.
import Control.Applicative
import System.Random
import Criterion.Main
import Data.List
iters :: Int
iters = 10000
loop1 :: IO Int
loop1 = go iters 0
where
go n acc | n < 0 = return acc
| otherwise = do
x <- randomRIO (0, 1 :: Double)
let acc' = acc + round x
acc' `seq` go (n - 1) acc'
loop1' :: IO Int
loop1' = go iters 0
where
go n acc | n < 0 = return acc
| otherwise = do
x <- randomRIO (0, 1)
let acc' = acc + x
acc' `seq` go (n - 1) acc'
loop2 :: IO Int
loop2 = do
g <- newStdGen
let s = foldl' (+) 0 . take iters . map round $ randomRs (0, 1 :: Double) g
s `seq` return s
loop2' :: IO Int
loop2' = do
g <- newStdGen
let s = foldl' (+) 0 . take iters $ randomRs (0, 1) g
s `seq` return s
loop3 :: IO Int
loop3 = do
g0 <- newStdGen
let go n acc g | n < 0 = acc
| otherwise = let (x, g') = randomR (0, 1 :: Double) g
acc' = acc + round x
in acc' `seq` go (n - 1) acc g'
return $! go iters 0 g0
loop3':: IO Int
loop3'= do
g0 <- newStdGen
let go n acc g | n < 0 = acc
| otherwise = let (x, g') = randomR (0, 1) g
acc' = acc + x
in acc' `seq` go (n - 1) acc g'
return $! go iters 0 g0
main :: IO ()
main = defaultMain $
[ bench "loop1" $ whnfIO loop1
, bench "loop2" $ whnfIO loop2
, bench "loop3" $ whnfIO loop3
, bench "loop1'" $ whnfIO loop1'
, bench "loop2'" $ whnfIO loop2'
, bench "loop3'" $ whnfIO loop3'
]
And here are the timings: Except, well. I'm running on a virtualbox vm, and performance is kinda random. The overall trends were consistent, though.
carl@debian:~/hask$ ghc -Wall -O2 randspeed.hs
[1 of 1] Compiling Main ( randspeed.hs, randspeed.o )
Linking randspeed ...
carl@debian:~/hask$ ./randspeed
warming up
estimating clock resolution...
mean is 3.759461 us (160001 iterations)
found 3798 outliers among 159999 samples (2.4%)
1210 (0.8%) low severe
2492 (1.6%) high severe
estimating cost of a clock call...
mean is 2.152186 us (14 iterations)
found 3 outliers among 14 samples (21.4%)
1 (7.1%) low mild
1 (7.1%) high mild
1 (7.1%) high severe
benchmarking loop1
mean: 15.88793 ms, lb 15.41649 ms, ub 16.37845 ms, ci 0.950
std dev: 2.472512 ms, lb 2.332036 ms, ub 2.650680 ms, ci 0.950
variance introduced by outliers: 90.466%
variance is severely inflated by outliers
benchmarking loop2
mean: 26.44217 ms, lb 26.28822 ms, ub 26.64457 ms, ci 0.950
std dev: 905.7558 us, lb 713.3236 us, ub 1.165090 ms, ci 0.950
found 8 outliers among 100 samples (8.0%)
6 (6.0%) high mild
2 (2.0%) high severe
variance introduced by outliers: 30.636%
variance is moderately inflated by outliers
benchmarking loop3
mean: 18.43004 ms, lb 18.29330 ms, ub 18.60769 ms, ci 0.950
std dev: 794.3779 us, lb 628.6630 us, ub 1.043238 ms, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 40.516%
variance is moderately inflated by outliers
benchmarking loop1'
mean: 4.579197 ms, lb 4.494131 ms, ub 4.677335 ms, ci 0.950
std dev: 468.0648 us, lb 406.8328 us, ub 558.5602 us, ci 0.950
found 2 outliers among 100 samples (2.0%)
2 (2.0%) high mild
variance introduced by outliers: 80.019%
variance is severely inflated by outliers
benchmarking loop2'
mean: 4.473382 ms, lb 4.386545 ms, ub 4.567254 ms, ci 0.950
std dev: 460.5377 us, lb 410.1520 us, ub 543.1835 us, ci 0.950
found 1 outliers among 100 samples (1.0%)
variance introduced by outliers: 80.033%
variance is severely inflated by outliers
benchmarking loop3'
mean: 3.577855 ms, lb 3.490043 ms, ub 3.697916 ms, ci 0.950
std dev: 522.4125 us, lb 416.7015 us, ub 755.3713 us, ci 0.950
found 5 outliers among 100 samples (5.0%)
4 (4.0%) high mild
1 (1.0%) high severe
variance introduced by outliers: 89.406%
variance is severely inflated by outliers
Here are my conclusions:
randomRs
is slow, due to list creation overhead. Too bad lists don't fuse with foldl'
, that'd make this faster and simpler.IO
. It shows up the version that uses Int
instead of double, but that's it.Random
instance for Double
is super-slow. Or else round
is super-slow. Or maybe the combination of the two is.I have tried a crude version of your code relying on laziness:
module Main where
import System.Environment
import Control.Applicative
import System.Random
main :: IO ()
main = do
args <- getArgs
let n = if null args then 10000 else read $ head args
g <- getStdGen
let vals = randomRs (0, 1) g :: [Int]
let s = sum $ take n vals
putStrLn $ "True: " ++ f s n ++ ", False" ++ f (n - s) n
f x y = show $ ((fromIntegral x / fromIntegral y) :: Double)
For now, ignore the fact that I've missed some type declarations and I have imported everything from the modules. I just wanted to be free to test.
Back at the castle, your version was saved as original.hs
while the above was saved as 1.hs
. Testing time:
[mihai@esgaroth so]$ ghc --make -O2 original.hs
[1 of 1] Compiling Main ( original.hs, original.o )
Linking original ...
[mihai@esgaroth so]$ ghc --make -O2 1.hs
[1 of 1] Compiling Main ( 1.hs, 1.o )
Linking 1 ...
[mihai@esgaroth so]$ time ./original
True: 0.4981, False 0.5019
real 0m0.022s
user 0m0.021s
sys 0m0.000s
[mihai@esgaroth so]$ time ./1
True: 0.4934, False0.5066
real 0m0.005s
user 0m0.003s
sys 0m0.001s
[mihai@esgaroth so]$ time ./original
True: 0.5063, False 0.4937
real 0m0.018s
user 0m0.017s
sys 0m0.001s
[mihai@esgaroth so]$ time ./1
True: 0.5024, False0.4976
real 0m0.005s
user 0m0.003s
sys 0m0.002s
Everytime, the new code was 4 time faster. And that is while still having the first version of using lazy constructs and already existing code.
Next step is to test the performance heaps and to see if it is worth to embed the sum computation when generating the random list.
PS: On my machine:
[mihai@esgaroth so]$ time java MonteCarlo 10000
True: 0.5011, False: 0.4989
real 0m0.063s
user 0m0.066s
sys 0m0.010s
PPS: Running the code compiled without -O2
:
[mihai@esgaroth so]$ time ./original
True: 0.5035, False 0.4965
real 0m0.032s
user 0m0.031s
sys 0m0.001s
[mihai@esgaroth so]$ time ./1
True: 0.4975, False0.5025
real 0m0.014s
user 0m0.010s
sys 0m0.003s
Only a 2 time reduction but still faster than java.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With