Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How much overhead do function calls have in Haskell?

I am new to Haskell and I am puzzled by the cost of a function call, which seems to be completely unreasonable to me, and makes me think I am doing something fundamentally wrong.

Consider the following Haskell code:

module Main where
logistic x = 4.0*x*(1.0-x)

lg :: Double -> Int -> Double
lg !x 0 = x
lg !x !n = lg (logistic x) (n-1)

main = putStrLn $ show $ lg 0.7861 100000000

Compiling this with the command

 ghc -O3 -XBangPatterns -o tsths tst.hs

and running it, I get:

real    0m15.904s
user    0m15.853s
sys     0m0.016s

If instead of calling the function logistic I calculate the expression inline:

module Main where

lg :: Double -> Int -> Double
lg !x 0 = x
lg !x !n = lg (4.0*x*(1.0-x)) (n-1)

main = putStrLn $ show $ lg 0.7861 100000000

the execution time becomes:

real    0m0.838s
user    0m0.828s
sys     0m0.004s

which is exactly the same as the equivalent C program, which is

#include <stdio.h>

int main() {
   int i, num=100000000;
   double x=0.7861;
   for (i=0; i<num; ++i)
      x *= 4.0*(1.0-x);
   printf("%lg\n", x);
}

Am I doing something terribly wrong?

Many thanks.

like image 895
user1654556 Avatar asked Sep 07 '12 15:09

user1654556


People also ask

How much overhead is a function call?

Long story short, the overhead of a direct (non-virtual) function call was approximately 5.5 nanoseconds, or 18 clock cycles, compared to an inline function call. The overhead of a virtual function call was 13.2 nanoseconds, or 42 clock cycles, compared to inline.

Do function calls have an overhead?

1) Function call overhead doesn't occur. 2) It also saves the overhead of push/pop variables on the stack when function is called. 3) It also saves overhead of a return call from a function. 4) When you inline a function, you may enable compiler to perform context specific optimization on the body of function.

Are function calls expensive?

Speaking from personal experience, I write code in a proprietary language that is fairly modern in terms of capability, but function calls are ridiculously expensive, to the point where even typical for loops have to be optimized for speed: for(Integer index = 0, size = someList.

Do functions slow down code?

Yes method calls slow down the code execution a tiny little bit, if they a not inlined by the c#-compiler or the jit-compiler. However, unless your code runs in a loop and is executed a million times or so, you should really focus on producing clean, understandable and maintainable code.


3 Answers

It's a bug in GHC-7.4.1. Looking at the generated core (only the core for the function lg is important, from GHC-7.4.2, we get

Main.lg3 :: GHC.Types.Double
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 30 0}]
Main.lg3 = GHC.Float.$w$cfromRational Main.lg4 Main.lg2

Main.lg1 :: GHC.Types.Double
[GblId,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=0, Value=False,
         ConLike=False, Cheap=False, Expandable=False,
         Guidance=IF_ARGS [] 30 0}]
Main.lg1 = GHC.Float.$w$cfromRational Main.lg2 Main.lg2

Main.$wlg :: GHC.Prim.Double# -> GHC.Prim.Int# -> GHC.Prim.Double#
[GblId,
 Arity=2,
 Str=DmdType LL,
 Unf=Unf{Src=<vanilla>, TopLvl=True, Arity=2, Value=True,
         ConLike=True, Cheap=True, Expandable=True,
         Guidance=IF_ARGS [0 30] 158 0}]
Main.$wlg =
  \ (ww_s1Oy :: GHC.Prim.Double#) (ww1_s1OC :: GHC.Prim.Int#) ->
    case ww1_s1OC of ds_Xvs {
      __DEFAULT ->
        case Main.lg3 of _ { GHC.Types.D# x_awJ ->
        case Main.lg1 of _ { GHC.Types.D# x1_awV ->
        letrec {
          $wlg1_X1PF [Occ=LoopBreaker]
            :: GHC.Prim.Double# -> GHC.Prim.Int# -> GHC.Prim.Double#
          [LclId, Arity=2, Str=DmdType LL]
          $wlg1_X1PF =
            \ (ww2_X1Pv :: GHC.Prim.Double#) (ww3_X1PA :: GHC.Prim.Int#) ->
              case ww3_X1PA of ds1_Xwr {
                __DEFAULT ->
                  $wlg1_X1PF
                    (GHC.Prim.*##
                       (GHC.Prim.*## x_awJ ww2_X1Pv) (GHC.Prim.-## x1_awV ww2_X1Pv))
                    (GHC.Prim.-# ds1_Xwr 1);
                0 -> ww2_X1Pv
              }; } in
        $wlg1_X1PF
          (GHC.Prim.*##
             (GHC.Prim.*## x_awJ ww_s1Oy) (GHC.Prim.-## x1_awV ww_s1Oy))
          (GHC.Prim.-# ds_Xvs 1)
        }
        };
      0 -> ww_s1Oy
    }

two top-level Doubles and a decent loop.

GHC-7.4.1 was a bit too inlining-happy, that produced

Rec {
Main.$wlg [Occ=LoopBreaker]
  :: GHC.Prim.Double# -> GHC.Prim.Int# -> GHC.Prim.Double#
[GblId, Arity=2, Str=DmdType LL]
Main.$wlg =
  \ (ww_s1NS :: GHC.Prim.Double#) (ww1_s1NW :: GHC.Prim.Int#) ->
    case ww1_s1NW of ds_Xvb {
      __DEFAULT ->
        case GHC.Float.$wfromRat'' (-1021) 53 Main.logistic4 Main.logistic2
        of ww2_a1Mt { __DEFAULT ->
        case GHC.Float.$wfromRat'' (-1021) 53 Main.logistic2 Main.logistic2
        of ww3_X1Nq { __DEFAULT ->
        Main.$wlg
          (GHC.Prim.*##
             (GHC.Prim.*## ww2_a1Mt ww_s1NS) (GHC.Prim.-## ww3_X1Nq ww_s1NS))
          (GHC.Prim.-# ds_Xvb 1)
        }
        };
      0 -> ww_s1NS
    }
end Rec }

and gave you two calls to the fromRational worker in each iteration.

Now, fromRational is a fairly complicated function. It is still rather slow, despite having gotten a much faster implementation in the 7.2 series, so these calls hurt big time.

With a type signature, there are no Rational top-level constants produced, only Double constants, and these are then used, which of course doesn't include a gratuitous slowdown.

like image 117
Daniel Fischer Avatar answered Oct 11 '22 21:10

Daniel Fischer


As suggested by Dan Burton it is actually overhead of polymorphic function, because GHC infers type logistic :: Fractional a => a -> a. If you specify type explicitly you generally enable both better checking and better optimizations. I believe it is good practice to specify type of function explicitly.

If you want to have function with polymorphic type but have full speed of monomorphic call in case of specific usage you can use SPECIALIZE pragma, but I believe this is GHC specific.

{-# LANGUAGE BangPatterns #-}
module Main where
logistic :: Fractional a => a -> a
{-# SPECIALISE logistic :: Double -> Double #-}
logistic x = 4.0*x*(1.0-x)

lg :: Double -> Int -> Double
lg !x 0 = x
lg !x !n = lg (logistic x) (n-1)

main = putStrLn $ show $ lg 0.7861 100000000

Also note that you can specify LANGUAGE pragma at the beginning of file to enable bang patterns and don't need to enable them on command line.

Times on my machine were 21 seconds for original, 0.67 sec for explicit type, 0.7 sec for specialize (which is basically the same).

I believe overhead of specialized call is very small because it is just bunch of instructions which gets inlined anyway but polymorphic function results in call. Though it is weird that GHC cant inline despite polymorphism.

like image 39
Vladimir Still Avatar answered Oct 11 '22 20:10

Vladimir Still


Add a type signature to logistic and you will see the speedup. Allow me to use CPP to demonstrate the difference.

bash> cat tst.hs
module Main where

#if defined(SIG)
logistic :: Double -> Double
#endif
logistic x = 4.0*x*(1.0-x)

lg :: Double -> Int -> Double
lg !x 0 = x
lg !x !n = lg (logistic x) (n-1)

main = putStrLn $ show $ lg 0.7861 100000000


bash> ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.4.1

If compiled without SIG defined (the type signature is excluded):

bash> ghc -O3 -XBangPatterns -XCPP -o tsths tst.hs
[1 of 1] Compiling Main             ( tst.hs, tst.o )
Linking tsths ...


bash> time ./tsths
0.34209286442469333

real    0m13.187s
user    0m13.177s
sys 0m0.008s

Now lets compile with SIG defined so the signature is included:

bash> rm tsths *.o *.hi


bash> ghc -O3 -XBangPatterns -XCPP -DSIG -o tsths tst.hs
[1 of 1] Compiling Main             ( tst.hs, tst.o )
Linking tsths ...
bash> time ./tsths
0.34209286442469333

real    0m0.464s
user    0m0.440s
sys 0m0.020s

Not sure why GHC doesn't optimize it without the signature; the monomorphism restriction should restrict it to Double -> Double anyways.

like image 35
Dan Burton Avatar answered Oct 11 '22 19:10

Dan Burton