Haskell: How to benchmark a computation accurately with deepseq/force

Question

I have a web server written in Haskell that computes some data in multiple steps.

I want to accurately measure and display how long each action takes.

In the presence of laziness, what is a good way to do this?

Note that "benchmarking" is not quite the right terminology since I only only want to measure time in a production system and not sample many runs. I know that for that case I can use criterion.

nh2 · Accepted Answer

You can use force from Control.DeepSeq to fully evaluate a data structure (and thus demand and measure its computation).

One problem is that forcing a large data structure takes some time itself!

This is because a deepseq (used by force) will walk down your algebraic data type tree, visiting every node (but not doing anything with it).

When you perform only a cheap operation to each node, such as map (*2) mylist, and try to measure how long it takes, this overhead can suddenly become significant, messing up your measurements.

import Control.DeepSeq
import Control.Exception (evaluate)
import Data.Time (diffUTCTime, getCurrentTime)


-- | Measures how long a computation takes, printing both the time and the
-- overhead of `force` to stdout. So it forces *twice*.
benchmarkForce :: NFData a => String -> IO a -> IO a
benchmarkForce msg action = do
    before <- getCurrentTime

    -- Force the first time to measure computation + forcing
    result <- evaluate . force =<< action

    after <- getCurrentTime

    -- Force again to see how long forcing itself takes
    _ <- evaluate . force $ result

    afterAgain <- getCurrentTime
    putStrLn $ msg ++ ": " ++ show (diffTimeMs before after) ++ " ms"
                   ++ " (force time: " ++ show (diffTimeMs after afterAgain) ++ " ms)"
    return result

    where
        -- Time difference `t2 - t1` in milliseconds
        diffTimeMs t1 t2 = realToFrac (t2 `diffUTCTime` t1) * 1000.0 :: Double

The first evaluate . force run will make sure your action and its return value are evaluated entirely.

By doing a second force run over the result, we can measure how much overhead it added to the first traversal.

This of course comes at the expense of two traversals; being able to measure how much time a deepseq wastes requires you to waste that time twice.

Here is an example to measure some pure functions with that:

main :: IO ()
main = do

    l <- benchmarkForce "create list" $
        return [1..10000000 :: Integer]

    _ <- benchmarkForce "double each list element" $
        return $ map (*2) l

    _ <- benchmarkForce "map id l" $
        return $ map id l

    return ()

(Of course it also works with functions in IO.)

The output:

create list: 1091.936 ms (force time: 71.33200000000001 ms)
double each list element: 1416.0569999999998 ms (force time: 96.808 ms)
map id l: 484.493 ms (force time: 67.232 ms)

As we can see, the force creates around 13% overhead in the map id l case.

Haskell: How to benchmark a computation accurately with deepseq/force

Tags:

benchmarking

haskell

lazy-evaluation

nh2

1 Answers

nh2

Recent Activity

Donate For Us

Haskell: How to benchmark a computation accurately with deepseq/force

Tags:

benchmarking

haskell

lazy-evaluation

nh2

1 Answers

nh2

Related questions

Recent Activity

Donate For Us