Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

GHC code generation for type class function calls

In Haskell to define an instance of a type class you need to supply a dictionary of functions required by the type class. I.e. to define an instance of Bounded, you need to supply a definition for minBound and maxBound.

For the purpose of this question, let's call this dictionary the vtbl for the type class instance. Let me know if this is poor analogy.

My question centers around what kind of code generation can I expect from GHC when I call a type class function. In such cases I see three possibilities:

  1. the vtbl lookup to find the implementation function is down at run time
  2. the vtbl lookup is done at compile time and a direct call to the implementation function is emitted in the generated code
  3. the vtbl lookup is done at compile time and the implementation function is inlined at the call site

I'd like to understand when each of these occur - or if there are other possibilities.

Also, does it matter if the type class was defined in a separately compiled module as opposed to being part of the "main" compilation unit?

In a runnable program it seems that Haskell knows the types of all the functions and expressions in the program. Therefore, when I call a type class function the compiler should know what the vtbl is and exactly which implementation function to call. I would expect the compiler to at least generate a direct call to implementation function. Is this true?

(I say "runnable program" here to distinguish it from compiling a module which you don't run.)

like image 702
ErikR Avatar asked Sep 28 '12 18:09

ErikR


2 Answers

As with all good questions, the answer is "it depends". The rule of thumb is that there's a runtime cost to any typeclass-polymorphic code. However, library authors have a lot of flexibility in eliminating this cost with GHC's rewrite rules, and in particular there is a {-# SPECIALIZE #-} pragma that can automatically create monomorphic versions of polymorphic functions and use them whenever the polymorphic function can be inferred to be used at the monomorphic type. (The price for doing this is library and executable size, I think.)

You can answer your question for any particular code segment using ghc's -ddump-simpl flag. For example, here's a short Haskell file:

vDouble :: Double
vDouble = 3
vInt = length [2..5]
main = print (vDouble + realToFrac vInt)

Without optimizations, you can see that GHC does the dictionary lookup at runtime:

Main.main :: GHC.Types.IO ()
[GblId]
Main.main =
  System.IO.print
    @ GHC.Types.Double
    GHC.Float.$fShowDouble
    (GHC.Num.+
       @ GHC.Types.Double
       GHC.Float.$fNumDouble
       (GHC.Types.D# 3.0)
       (GHC.Real.realToFrac
          @ GHC.Types.Int
          @ GHC.Types.Double
          GHC.Real.$fRealInt
          GHC.Float.$fFractionalDouble
          (GHC.List.length
             @ GHC.Integer.Type.Integer
             (GHC.Enum.enumFromTo
                @ GHC.Integer.Type.Integer
                GHC.Enum.$fEnumInteger
                (__integer 2)
                (__integer 5)))))

...the relevant bit being realToFrac @Int @Double. At -O2, on the other hand, you can see it did the dictionary lookup statically and inlined the implementation, the result being a single call to int2Double#:

Main.main2 =
  case GHC.List.$wlen @ GHC.Integer.Type.Integer Main.main3 0
  of ww_a1Oq { __DEFAULT ->
  GHC.Float.$w$sshowSignedFloat
    GHC.Float.$fShowDouble_$sshowFloat
    GHC.Show.shows26
    (GHC.Prim.+## 3.0 (GHC.Prim.int2Double# ww_a1Oq))
    (GHC.Types.[] @ GHC.Types.Char)
  }

It's also possible for a library author to choose to rewrite the polymorphic function to a call to a monomorphic one but not inline the implementation of the monomorphic one; this means that all of the possibilities you proposed (and more) are possible.

like image 142
Daniel Wagner Avatar answered Nov 18 '22 03:11

Daniel Wagner


If the compiler can "tell", at compile-time, what actual type you're using, then the method lookup happens at compile-time. Otherwise it happens at run-time. If lookup happens at compile-time, the method code may be inlined, depending on how large the method is. (This goes for regular functions too: If the compiler knows which function you're calling, it will inline it if that function is "small enough".)

Consider, for example, (sum [1 .. 10]) :: Integer. Here the compiler statically knows that the list is a list of Integer takes, so it can inline the + function for Integer. On the other hand, if you do something like

foo :: Num x => [x] -> x
foo xs = sum xs - head x

then, when you call sum, the compiler doesn't know what type you're using. (It depends on what type is given to foo), so it can't do any compile-time lookup.

On the other hand, using the {-# SPECIALIZE #-} pragma, you can do something like

{-# SPECIALIZE foo:: [Int] -> Int #-}

What this does is tell the compiler to compile a special version of foo where the input is a list of Int values. This obviously means that for that version, the compiler can do all the method lookups at compile-time (and almost certainly inline them all). Now there are two versions of foo - one which works for any type and does run-time type lookups, and one that works only for Int, but is [probably] much faster.

When you call the foo function, the compiler has to decide which version to call. If the compiler can "tell", at compile-time, that you want the Int version, it will do that. If it can't "tell" what type you're going to use, it'll use the slower any-type version.

Note that you can have multiple specialisations of a single function. For example, you could do

{-# SPECIALIZE foo :: [Int] -> Int #-}
{-# SPECIALIZE foo :: [Double] -> Double #-}
{-# SPECIALIZE foo :: [Complex Double] -> Complex Double #-}

Now, whenever the compiler can tell that you're using one of these types, it'll use the version hard-coded for that type. But if the compiler can't tell what type you're using, it'll never use the specialised versions, and always the polymorphic one. (That might mean that you need to specialise the function(s) that call foo, for example.)

If you crawl around the compiler's Core output, you can probably figure out exactly what it did in any particular circumstance. You will probably go stark raving mad though...

like image 11
MathematicalOrchid Avatar answered Nov 18 '22 03:11

MathematicalOrchid