
Forcing evaluation of function input before benchmarking in Criterion

How do you force evaluation of a function's input before benchmarking that function in Criterion? I am trying to benchmark some functions, but would like to exclude the time taken to evaluate the input thunk. The code in question uses unboxed vectors as input, and the Int vectors can't be deepseq'd. An example snippet is below:

-- V is Data.Vector.Unboxed
shortv = V.fromList [1..10] :: V.Vector GHC.Int.Int16
intv = V.fromList [1..10] :: V.Vector GHC.Int.Int32

main :: IO ()
main = defaultMain [
          bench "encode ShortV" $ whnf encodeInt16V shortv
          ,bench "encode IntV" $ whnf encodeInt32V intv
       ]

The Criterion benchmark appears to include the time to build the shortv and intv inputs when benchmarking the functions above. The measurements are below: roughly 400 ns for each function, which seems to include the build time for the inputs:

benchmarking encode ShortV
mean: 379.6917 ns, lb 378.0229 ns, ub 382.4529 ns, ci 0.950
std dev: 10.79084 ns, lb 7.360444 ns, ub 15.89614 ns, ci 0.950

benchmarking encode IntV
mean: 392.2736 ns, lb 391.2816 ns, ub 393.4853 ns, ci 0.950
std dev: 5.565134 ns, lb 4.694539 ns, ub 6.689224 ns, ci 0.950 

Now, if the benchmark's main is changed to the following (removing the second bench call):

main = defaultMain [
          bench "encode ShortV" $ whnf encodeInt16V shortv
       ]

the shortv input seems to be evaluated before encodeInt16V is benchmarked. That is the output I want, because the benchmark then measures only the function's execution time, excluding the time to build the input. Criterion output below:

benchmarking encode ShortV
mean: 148.8488 ns, lb 148.4714 ns, ub 149.6279 ns, ci 0.950
std dev: 2.658834 ns, lb 1.621119 ns, ub 5.184792 ns, ci 0.950

Similarly, if I run only the "encode IntV" benchmark, I get ~150 ns for that one too.

I know from the Criterion documentation that it tries to avoid lazy evaluation for more accurate benchmarking. That makes sense and is not really the issue here. My question is: how do I build the shortv and intv inputs so that they are already evaluated before being passed to bench? Right now I can only achieve this by restricting defaultMain to one benchmark at a time (as shown above), which is not an ideal solution.

EDIT1

There is something else going on here with the Criterion benchmark, and it seems to happen only with vectors, not lists. Even if I force full evaluation by printing shortv and intv, the benchmark still measures ~400 ns, not ~150 ns. Updated code below:

main = do
  V.forM_ shortv $ \x -> do print x
  V.forM_ intv $ \x -> do print x
  defaultMain [
          bench "encode ShortV" $ whnf encodeInt16V shortv
          ,bench "encode IntV" $ whnf encodeInt32V intv
       ]

Criterion output (it also reports 158.4% outliers, which seems incorrect):

estimating clock resolution...
mean is 5.121819 us (160001 iterations)
found 253488 outliers among 159999 samples (158.4%)
  126544 (79.1%) low severe
  126944 (79.3%) high severe
estimating cost of a clock call...
mean is 47.45021 ns (35 iterations)
found 5 outliers among 35 samples (14.3%)
  2 (5.7%) high mild
  3 (8.6%) high severe

benchmarking encode ShortV
mean: 382.1599 ns, lb 381.3501 ns, ub 383.0841 ns, ci 0.950
std dev: 4.409181 ns, lb 3.828800 ns, ub 5.216401 ns, ci 0.950

benchmarking encode IntV
mean: 394.0517 ns, lb 392.4718 ns, ub 396.7014 ns, ci 0.950
std dev: 10.20773 ns, lb 7.101707 ns, ub 17.53715 ns, ci 0.950
Sal asked Dec 04 '11 22:12


1 Answer

You could use evaluate (from Control.Exception) before calling defaultMain to force the inputs before the benchmarks run. I'm not sure it is the cleanest solution, but it would look like this:

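import Control.Exception (evaluate)

main = do
  -- force each input to WHNF before any benchmark runs
  _ <- evaluate shortv
  _ <- evaluate intv
  defaultMain [..]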
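
For reference, here is a self-contained sketch of how the evaluate approach might be combined with the benchmark list from the question. The bodies of encodeInt16V and encodeInt32V are not shown in the question, so the versions below are hypothetical stand-ins (a plain V.sum) whose only purpose is to let the module compile; their type signatures are assumptions as well. evaluate only forces its argument to weak head normal form, but for an unboxed vector that should be enough to materialize the elements, since they are stored unboxed in the underlying array.

```haskell
import Control.Exception (evaluate)
import Criterion.Main (bench, defaultMain, whnf)
import Data.Int (Int16, Int32)
import qualified Data.Vector.Unboxed as V

shortv :: V.Vector Int16
shortv = V.fromList [1..10]

intv :: V.Vector Int32
intv = V.fromList [1..10]

-- Hypothetical stand-ins for the encoders in the question,
-- whose real definitions (and types) are not shown there.
encodeInt16V :: V.Vector Int16 -> Int16
encodeInt16V = V.sum

encodeInt32V :: V.Vector Int32 -> Int32
encodeInt32V = V.sum

main :: IO ()
main = do
  -- Force both inputs to WHNF up front, so that building them
  -- is not charged to whichever benchmark touches them first.
  _ <- evaluate shortv
  _ <- evaluate intv
  defaultMain
    [ bench "encode ShortV" $ whnf encodeInt16V shortv
    , bench "encode IntV"   $ whnf encodeInt32V intv
    ]
```

Whether this removes the extra ~250 ns seen in the question depends on what is actually causing it; EDIT1 suggests that forcing the inputs alone may not account for the difference.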
Erik Hesselink answered Oct 23 '22 07:10