
Why does a more general type affect runtime in Haskell?

Tags:

types

haskell

ghc

Consider the following two implementations of an infinite Fibonacci sequence:

fibsA :: Num a => [a]
fibsA = 0:1:(zipWith (+) fibsA (tail fibsA))

fibsB :: [Integer]
fibsB = 0:1:(zipWith (+) fibsB (tail fibsB))

In GHCi, executing fibsB !! k is much faster than executing fibsA !! k. In particular, it seems that the values of fibsA are continually recalculated (not memoized/stored).

Furthermore, when the type signature is omitted, GHCi's :t shows the type as [Integer], and the value performs accordingly.

This behavior also occurs in compiled code (ghc -O3 Fibs.hs).

Why is it the case that Integer is so much faster than Num a => a?
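(A rough way to observe this in GHCi is the :set +s flag, which prints elapsed time and allocation after each evaluation; the index below is illustrative and the exact numbers will vary.)

ghci> :set +s       -- report time and allocation after each result
ghci> fibsA !! 30   -- noticeably slower, and just as slow when repeated
ghci> fibsB !! 30
ghci> fibsB !! 30   -- the repeat is nearly free: the Integer list is retained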

asked Dec 29 '14 by wchargin


1 Answer

When you write fibsA :: Num a => [a], the compiler constructs what is essentially

fibsA :: NumDict a -> [a]

Where

data NumDict a = NumDict
    { (+)         :: a -> a -> a
    , (-)         :: a -> a -> a
    , (*)         :: a -> a -> a
    , negate      :: a -> a
    , abs         :: a -> a
    , signum      :: a -> a
    , fromInteger :: Integer -> a
    }

Notice that Num a has moved from being a constraint to being an argument to the function. A typeclass is essentially just a lookup table: each type that implements the class gets its own table of methods. So for Num, you'd have by default

mkInteger_NumDict :: NumDict Integer
mkInteger_NumDict = NumDict
    { (+) = integer_plus
    , (-) = integer_minus
    , (*) = integer_mult
    , ...
    }

mkInt_NumDict     :: NumDict Int

mkFloat_NumDict   :: NumDict Float

mkDouble_NumDict  :: NumDict Double

and these get passed automatically to a function using a typeclass whenever the instance is resolved. This means that our function fibsA essentially takes an extra argument: the dictionary. When you call it from GHCi, the defaulting rules kick in and pick Integer, but because the dictionary is an ordinary argument, internally the definition looks more like this:

{-# LANGUAGE RecordWildCards #-}  -- To reduce typing

fibsA :: NumDict a -> [a]
fibsA nd@(NumDict{..}) = fromInteger 0 : fromInteger 1 : zipWith (+) (fibsA nd) (tail $ fibsA nd)
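At a use site the dictionary is applied in the same way. Defaulted to Integer, the question's fibsA !! k becomes roughly this (a sketch, reusing the illustrative mkInteger_NumDict name from above):

-- the dictionary is now an ordinary argument; inside fibsA every recursive
-- reference re-applies the function to nd and builds a brand-new list
fibsA mkInteger_NumDict !! k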

Do you see the problem with this? It's still recursive, but now it has to make a function call at each step of the way, reducing performance. To make it really fast, a smart programmer would write

fibsA nd@(NumDict{..}) = fromInteger 0 : fromInteger 1 : zipWith (+) fibsA' (tail fibsA')
    where fibsA' = fibsA nd

This at least allows memoization. However, a Haskell binary can't really perform this optimization at runtime; that happens at compile time. So what you end up with is a slower recursive function.

With fibsB, you're specifying the type concretely, so there are no polymorphic constraints in its type signature. The value fibsB has no implicit or explicit arguments, so every reference to it is a pointer to the same object in memory. fibsA, on the other hand, is a pointer to a function, so each recursive use builds new objects in memory and nothing is memoized. This is why fibsB is faster than fibsA: only fibsB gets optimized, because the compiler doesn't have to make it work for every Num instance, only Integer.
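For completeness, here is a source-level sketch of the same idea (my illustration, not from the original answer): tie the knot on a local list so the recursion never goes back through the constrained fibsA. Each use of the polymorphic value still builds its own copy per dictionary, but within one use the k-th element is computed without recomputing the prefix over and over. Compiling with optimization, or adding a SPECIALISE pragma for the types you actually use, additionally lets GHC emit a concrete Integer copy much like fibsB.

fibsA :: Num a => [a]
fibsA = fibs
  where
    -- the recursion is on the local list, not on fibsA itself, so the
    -- dictionary's (+) and fromInteger are consulted once per element
    -- instead of the whole prefix being rebuilt at every step
    fibs = 0 : 1 : zipWith (+) fibs (tail fibs)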

answered Nov 16 '22 by bheklilr