
How does HasCallStack influence the performance of a normal branch in Haskell?

Generating a call stack when reaching the error branch has runtime cost; that is easy to understand.

But does a HasCallStack constraint also affect the performance of the normal (non-error) branch? If so, how?

luochen1990 asked Aug 13 '19 05:08


1 Answer

The effect of adding a HasCallStack constraint to a function foo is more or less equivalent to:

  • adding an additional input argument for the call stack to foo's argument list;
  • wherever foo is invoked, constructing a call stack argument for it, by pushing an information frame (consisting of the function name "foo" and the source location where it was invoked) onto the input call stack (if foo is invoked from another function with a HasCallStack constraint) or onto an empty call stack (if it is invoked from a function without a HasCallStack constraint).
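The two cases in the second bullet can be observed directly with the real GHC.Stack API. This is only a minimal sketch: the names frames, viaConstrained and viaPlain are made up for illustration, and the source locations are dropped so that only the function names of each frame are printed.

```haskell
import GHC.Stack (HasCallStack, callStack, getCallStack)

-- Hypothetical helper: names of the frames on the current call stack,
-- most recent call first.
frames :: HasCallStack => [String]
frames = map fst (getCallStack callStack)

-- Caller WITH a HasCallStack constraint: the frame for "frames" is pushed
-- onto the stack this function received from its own caller.
viaConstrained :: HasCallStack => [String]
viaConstrained = frames

-- Caller WITHOUT the constraint: the frame is pushed onto an empty stack.
viaPlain :: [String]
viaPlain = frames

main :: IO ()
main = do
  print viaConstrained  -- ["frames","viaConstrained"]
  print viaPlain        -- ["frames"]
```

Each frame is named after the function being called, not the caller, which is why "frames" shows up in both lists.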

So... if you have some functions:

foo :: HasCallStack => Int -> String -> String
foo n = bar n '*'

bar :: HasCallStack => Int -> Char -> String -> String
bar n c str = if n >= 0 then c' ++ ' ':str ++ ' ':c'
              else error "bad n"
  where c' = replicate n c

baz :: String
baz = foo 3 "hello"

then adding HasCallStack to foo and bar (but leaving baz alone) has basically the same effect as if you'd written:

foo cs n = bar cs' n '*'
  where cs' = pushCallStack ("bar", <loc>) cs
bar cs n c str
  = if n >= 0 then c' ++ ' ':str ++ ' ':c'
    else error cs' "bad n"
  where c' = replicate n c
        cs' = pushCallStack ("error", <loc>) cs
baz = foo cs' 3 "hello"
  where cs' = pushCallStack ("foo", <loc>) emptyCallStack
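For anyone who wants to run that translation, here is a rough compilable sketch of it. It is not what GHC literally generates: the E-suffixed names, the loc helper (made-up source locations standing in for the <loc> placeholders) and errorE are all assumptions for illustration. pushCallStack, emptyCallStack, prettyCallStack and SrcLoc are the real exports from GHC.Stack.

```haskell
import GHC.Stack
  (CallStack, SrcLoc (..), emptyCallStack, prettyCallStack, pushCallStack)

-- Made-up source location standing in for the <loc> placeholders.
loc :: Int -> SrcLoc
loc line = SrcLoc "main" "Main" "Example.hs" line 1 line 1

-- Hypothetical stand-in for `error` that takes its stack explicitly.
errorE :: CallStack -> String -> a
errorE cs msg = errorWithoutStackTrace (msg ++ "\n" ++ prettyCallStack cs)

-- foo with the call stack as an ordinary extra argument.
fooE :: CallStack -> Int -> String -> String
fooE cs n = barE cs' n '*'
  where cs' = pushCallStack ("bar", loc 2) cs

-- bar likewise; its extended stack is only needed in the error branch.
barE :: CallStack -> Int -> Char -> String -> String
barE cs n c str
  = if n >= 0 then c' ++ ' ':str ++ ' ':c'
    else errorE cs' "bad n"
  where c' = replicate n c
        cs' = pushCallStack ("error", loc 6) cs

-- baz has no incoming stack, so it starts from the empty one.
bazE :: String
bazE = fooE cs' 3 "hello"
  where cs' = pushCallStack ("foo", loc 10) emptyCallStack

main :: IO ()
main = do
  putStrLn bazE
  -- The stack argument is never forced on the normal branch,
  -- so even a bottom value passes through harmlessly.
  putStrLn (fooE (error "call stack was forced") 3 "hello")
```

Both lines print "*** hello ***". The second call shows that on the normal branch the stack is only ever passed along as an unevaluated argument.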

So, the baseline, unoptimized performance cost is the cost of an extra parameter for each function decorated with HasCallStack plus the cost of a thunk allocation to supply that parameter for every invocation point of the decorated function. (These costs are paid even if no error is triggered.)

In practice, optimized code will be... erm... optimized. For example, if the above example is compiled with -O2, foo will be inlined and bar will be specialized in the definition of baz in such a way that the only runtime cost of the call stack is that a static pointer (to a thunk for creating the full call stack for the error call) gets passed to the specialized version of bar (but ignored, since no error is generated).

GHC doesn't seem to be smart enough to determine that baz will never follow the error case and so doesn't need the stack frame at all.

K. A. Buhr answered Sep 24 '22 01:09