I have a web server written in Haskell that computes some data in multiple steps.
I want to accurately measure and display how long each action takes.
In the presence of laziness, what is a good way to do this?
Note that "benchmarking" is not quite the right terminology, since I only want to measure time in a production system rather than sample many runs. I know that for the latter case I can use criterion.
You can use `force` from `Control.DeepSeq` to fully evaluate a data structure (and thus demand and measure its computation).
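As a minimal illustration of the difference this makes (the names here are just for the example): `evaluate` alone only brings a value to weak head normal form, while `evaluate . force` computes every element.

```haskell
import Control.DeepSeq (force)
import Control.Exception (evaluate)

main :: IO ()
main = do
  let xs = map (*2) [1 .. 1000000 :: Int]
  -- evaluate alone reaches WHNF only: the first (:) cell, no elements
  _ <- evaluate xs
  -- force walks the whole list, so every element is actually computed
  ys <- evaluate (force xs)
  print (last ys)
```

Timing the `evaluate xs` line would measure essentially nothing; timing the `evaluate (force xs)` line measures the whole computation.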
One problem is that forcing a large data structure takes some time itself!
This is because `deepseq` (used by `force`) will walk down your algebraic data type tree, visiting every node (but not doing anything with it).
When you perform only a cheap operation on each node, such as `map (*2) mylist`, and try to measure how long it takes, this overhead can suddenly become significant, messing up your measurements.
import Control.DeepSeq
import Control.Exception (evaluate)
import Data.Time (diffUTCTime, getCurrentTime)

-- | Measures how long a computation takes, printing both the time and the
-- overhead of `force` to stdout. So it forces *twice*.
benchmarkForce :: NFData a => String -> IO a -> IO a
benchmarkForce msg action = do
  before <- getCurrentTime

  -- Force the first time to measure computation + forcing
  result <- evaluate . force =<< action
  after <- getCurrentTime

  -- Force again to see how long forcing itself takes
  _ <- evaluate . force $ result
  afterAgain <- getCurrentTime

  putStrLn $ msg ++ ": " ++ show (diffTimeMs before after) ++ " ms"
    ++ " (force time: " ++ show (diffTimeMs after afterAgain) ++ " ms)"
  return result
  where
    -- Time difference `t2 - t1` in milliseconds
    diffTimeMs t1 t2 = realToFrac (t2 `diffUTCTime` t1) * 1000.0 :: Double
The first `evaluate . force` run will make sure your `action` and its return value are evaluated entirely. By doing a second `force` run over the result, we can measure how much overhead it added to the first traversal.
This of course comes at the expense of two traversals: being able to measure how much time a `deepseq` wastes requires you to waste that time twice.
Here is an example that measures some pure functions with it:
main :: IO ()
main = do
  l <- benchmarkForce "create list" $
    return [1..10000000 :: Integer]
  _ <- benchmarkForce "double each list element" $
    return $ map (*2) l
  _ <- benchmarkForce "map id l" $
    return $ map id l
  return ()
(Of course it also works with functions in IO.)
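For instance, a sketch of measuring a real IO action with it (the file name is made up for the example, and `benchmarkForce` is the function defined above):

```haskell
-- Assumes benchmarkForce (defined above) is in scope.
ioExample :: IO ()
ioExample = do
  -- Set up an input file so the example is self-contained
  writeFile "bench-input.txt" (unlines (map show [1 .. 100000 :: Int]))
  -- Measure an IO action: reading the file back and summing its numbers.
  -- force ensures the sum is actually computed inside the timed region,
  -- not later when `total` is first used.
  total <- benchmarkForce "sum numbers from file" $ do
    s <- readFile "bench-input.txt"
    return (sum (map read (lines s)) :: Int)
  print total
```

Without the forcing, lazy IO (`readFile`) would let most of the work escape the timed region.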
The output:
create list: 1091.936 ms (force time: 71.33200000000001 ms)
double each list element: 1416.0569999999998 ms (force time: 96.808 ms)
map id l: 484.493 ms (force time: 67.232 ms)
As we can see, the `force` creates around 13% overhead in the `map id l` case.
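If that double traversal is too costly, a cheaper (but far less informative) alternative is to time only evaluation to WHNF. This is a sketch, and `benchmarkWHNF` is a name made up here; note that for a lazy structure like a list it measures only up to the outermost constructor, which is usually misleading:

```haskell
import Control.Exception (evaluate)
import Data.Time (diffUTCTime, getCurrentTime)

-- Hypothetical variant: times evaluation to WHNF only, so there is no
-- deep-traversal overhead -- but it measures only the outermost
-- constructor, not the full computation.
benchmarkWHNF :: String -> IO a -> IO a
benchmarkWHNF msg action = do
  before <- getCurrentTime
  result <- evaluate =<< action
  after  <- getCurrentTime
  putStrLn $ msg ++ ": "
    ++ show (realToFrac (after `diffUTCTime` before) * 1000.0 :: Double)
    ++ " ms (WHNF only)"
  return result
```

This only makes sense for results that are already strict (e.g. an `Int` from a strict fold), where WHNF and normal form coincide.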