Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Chaining Haskell Functions in data types

Tags:

haskell

ghc

Lets say I have the following:

data FuncAndValue v res = FuncAndValue (v -> res) v

chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v  

Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?

Basically, does storing functions in data types in anyway inhibit optimizations.

I'd like GHC to be easily able to compose chains of functions into one (i.e. so a "sum" on my structure doesn't involve repeated calls to a thunk that represents (+) and instead just inlines the (+) so it runs like a for loop. I'm hoping storing functions in data types and then accessing them later doesn't inhibit this.

like image 915
Clinton Avatar asked Oct 06 '22 10:10

Clinton


1 Answers

Is GHC likely to be able to combine functions new_f and old_f into a single function through inlining?

Yes, if it could do the same without the intervening FuncAndValue. Of course the unfoldings of the functions need to be available, or there wouldn't be any chance of inlining anyway. But if there is a chance, wrapping the function(s) in a FuncAndValue makes little difference if any.

But let's ask GHC itself. First the type and a very simple chaining:

module FuncAndValue where

data FuncAndValue v res = FuncAndValue (v -> res) v

infixr 7 `chain`

chain :: (res -> new_res) -> FuncAndValue v res -> FuncAndValue v new_res
chain new_f (FuncAndValue old_f v) = FuncAndValue (new_f . old_f) v

apply :: FuncAndValue v res -> res
apply (FuncAndValue f x) = f x

trivia :: FuncAndValue Int (Int,Int)
trivia = FuncAndValue (\x -> (2*x - 1, 3*x + 2)) 1

composed :: FuncAndValue Int Int
composed = chain (uncurry (+)) trivia

and (the interesting part of) the core we get for trivia and composed:

FuncAndValue.trivia1 =
  \ (x_af2 :: GHC.Types.Int) ->
    (case x_af2 of _ { GHC.Types.I# y_agp ->
     GHC.Types.I# (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
     },
     case x_af2 of _ { GHC.Types.I# y_agp ->
     GHC.Types.I# (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2)
     })

FuncAndValue.composed2 =
  \ (x_agg :: GHC.Types.Int) ->
    case x_agg of _ { GHC.Types.I# y_agp ->
    GHC.Types.I#
      (GHC.Prim.+#
         (GHC.Prim.-# (GHC.Prim.*# 2 y_agp) 1)
         (GHC.Prim.+# (GHC.Prim.*# 3 y_agp) 2))
    }

Inlined fair enough, no (.) to be seen. The two cases from trivia have been joined so that we have only one in composed. Unless somebody teaches GHC enough algebra to simplify \x -> (2*x-1) + (3*x+2) to \x -> 5*x + 1, that's as good as you can hope. apply composed is reduced to 6 at compile time, even in a separate module.

But that was very simple, let's give it a somewhat harder nut to crack.

An inlinable version of until (the current definition of until is recursive, so GHC doesn't inline it),

module WWUntil where

wwUntil :: (a -> Bool) -> (a -> a) -> a -> a
wwUntil p f = recur
  where
    recur x
        | p x       = x
        | otherwise = recur (f x)

Another simple function it its own module,

collatzStep :: Int -> Int
collatzStep n
    | n .&. 1 == 0  = n `unsafeShiftR` 1
    | otherwise     = 3*n + 1

and finally, the nut

module Hailstone (collatzLength, hailstone) where

import FuncAndValue
import CollatzStep
import WWUntil

data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !Int

fstP :: P -> Int
fstP (P x _) = x

sndP :: P -> Int
sndP (P _ y) = y

hailstone :: Int -> FuncAndValue Int Int
hailstone n = sndP `chain` wwUntil ((== 1) . fstP) (\(P n k) -> P (collatzStep n) (k+1))
                   `chain` FuncAndValue (\x -> P x 0) n

collatzLength :: Int -> Int
collatzLength = apply . hailstone

I have helped the strictness analyser a bit by using a strict pair. With the vanilla (,) the second component would be unboxed and reboxed after adding 1 in each step, and I just can't bear such waste ;) But otherwise there's no relevant difference.

And (the interesting part of) the core GHC generates:

Rec {
Hailstone.$wrecur [Occ=LoopBreaker]
  :: GHC.Prim.Int#
     -> GHC.Prim.Int# -> (# GHC.Prim.Int#, GHC.Prim.Int# #)
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LL]
Hailstone.$wrecur =
  \ (ww_sqq :: GHC.Prim.Int#) (ww1_sqr :: GHC.Prim.Int#) ->
    case ww_sqq of wild_Xm {
      __DEFAULT ->
        case GHC.Prim.word2Int#
               (GHC.Prim.and# (GHC.Prim.int2Word# wild_Xm) (__word 1))
        of _ {
          __DEFAULT ->
            Hailstone.$wrecur
              (GHC.Prim.+# (GHC.Prim.*# 3 wild_Xm) 1) (GHC.Prim.+# ww1_sqr 1);
          0 ->
            Hailstone.$wrecur
              (GHC.Prim.uncheckedIShiftRA# wild_Xm 1) (GHC.Prim.+# ww1_sqr 1)
        };
      1 -> (# 1, ww1_sqr #)
    }
end Rec }

lvl_rsz :: GHC.Types.Int -> GHC.Types.Int
[GblId, Arity=1, Caf=NoCafRefs]
lvl_rsz =
  \ (x_iog :: GHC.Types.Int) ->
    case x_iog of _ { GHC.Types.I# tpl1_B4 ->
    case Hailstone.$wrecur tpl1_B4 0 of _ { (# _, ww2_sqH #) ->
    GHC.Types.I# ww2_sqH
    }
    }

and that's exactly what you get without FuncAndValue. Everything inlined nicely, a beautiful tight loop.

Basically, does storing functions in data types in anyway inhibit optimizations.

If you wrap the function under enough layers, yes. But it's the same with other values.

like image 198
Daniel Fischer Avatar answered Oct 10 '22 01:10

Daniel Fischer