Does GHC ever unpack sum types when passing them to functions? For example, let's say that we have the following type:
data Foo
= Foo1 {-# UNPACK #-} !Int {-# UNPACK #-} !Word
| Foo2 {-# UNPACK #-} !Int
| Foo3 {-# UNPACK #-} !Word
Then I define a function that is strict in its Foo
argument:
consumeFoo :: Foo -> Int
consumeFoo x = case x of ...
At runtime, when I call consumeFoo
, what can I expect to happen? The GHC calling convention is to pass arguments in registers (or on the stack once there are too many). I can see two ways that the argument passing could go:
Foo
on the heap gets passed in as one argument.Foo
is used, one argument representing the data constructor that was used and the other two representing the possible Int
and Word
values in the data constructor.I would prefer the second representation, but I don't know if it is actually what happens. I am aware of UnpackedSumTypes landing in GHC 8.2, but it's unclear if it does what I want. If I had instead written the function as:
consumeFooAlt :: (# (# Int#, Word# #) | Int# | Word# #) -> Int
Then I would expect that evaluation (2) would be what happens. And the Unpacking section of the unpacked sums page indicates that I could do this as well:
data Wrap = Wrap {-# UNPACK #-} !Foo
consumeFooAlt2 :: Wrap -> Int
And that should also have the representation I want, I think.
So my question is, without using a wrapper type or a raw unpacked sum, how can I guarentee that a sum is unpacked into registers (or onto the stack) when I pass it as an argument to a function? If it is possible, is it something that GHC 8.0 can already do, or is it something that will only be available in GHC 8.2?
First: Guaranteed optimization and GHC don't mix well. Due to the high level it is very hard to predict the code that GHC will generate in every case. The only way to be sure is to look at the Core. If you're developing an extremely performance dependent application with GHC, then you need to become familar with Core I.
I am not aware of any optimization in GHC that does exactly what you describe. Here is an example program:
module Test where
data Sum = A {-# UNPACK #-} !Int | B {-# UNPACK #-} !Int
consumeSum :: Sum -> Int
consumeSum x = case x of
A y -> y + 1
B y -> y + 2
{-# NOINLINE consumeSumNoinline #-}
consumeSumNoinline = consumeSum
{-# INLINE produceSumInline #-}
produceSumInline :: Int -> Sum
produceSumInline x = if x == 0 then A x else B x
{-# NOINLINE produceSumNoinline #-}
produceSumNoinline :: Int -> Sum
produceSumNoinline x = if x == 0 then A x else B x
test :: Int -> Int
--test x = consumeSum (produceSumInline x)
test x = consumeSumNoinline (produceSumNoinline x)
Let's first look at what happens if we don't inline consumeSum
nor produceSum
. Here is the core:
test :: Int -> Int
test = \ (x :: Int) -> consumeSumNoinline (produceSumNoinline x)
(produced with ghc-core test.hs -- -dsuppress-unfoldings -dsuppress-idinfo -dsuppress-module-prefixes -dsuppress-uniques
)
Here, we can see that GHC (8.0 in this case) does not unbox the sum type passed as a function argument. Nothing changes if we inline either consumeSum
or produceSum
.
If we inline both however, then the following code is generated:
test :: Int -> Int
test =
\ (x :: Int) ->
case x of _ { I# x1 ->
case x1 of wild1 {
__DEFAULT -> I# (+# wild1 2#);
0# -> lvl1
}
}
What happened here is that through inlining, GHC ends up with:
\x -> case (if x == 0 then A x else B x) of
A y -> y + 1
B y -> y + 2
Which through the case-of-case (if
is just a special case
) is turned into:
\x -> if x == 0 then case (A x) of ... else case (B x) of ...
Now that is a case with a known constructor, so GHC can reduce the case at compile time ending up with:
\x -> if x == 0 then x + 1 else x + 2
So it completely eliminated the constructor.
In summary, I believe that GHC does not have any concept of an "unboxed sum" type prior to version 8.2, which also applies to function arguments. The only way to get "unboxed" sums is to get the constructor eliminated completely through inlining.
If you need such an optimization, your simplest solution is to do it yourself. I think there are actually many ways to achieve this, but one is:
data Which = Left | Right | Both
data Foo = Foo Which Int Word
The unpacking of any fields of this type is completely irrelevant to the question of the 'shape of the representation', which is what you are really asking about. Enumerations are already highly optimized - only one value for every constructor is ever created - so the addition of this field doesn't affect performance. The unpacked representation of this type is precisely what you want - one word for Which
constructor and one for each field.
If you write your functions in the proper way, you get the proper code:
data Which = Lft | Rgt | Both
data Foo = Foo Which {-# UNPACK #-} !Int {-# UNPACK #-} !Word
consumeFoo :: Foo -> Int
consumeFoo (Foo w l r) =
case w of
Lft -> l
Rgt -> fromIntegral r
Both -> l + fromIntegral r
The generated core is quite obvious:
consumeFoo :: Foo -> Int
consumeFoo =
\ (ds :: Foo) ->
case ds of _ { Foo w dt dt1 ->
case w of _ {
Lft -> I# dt;
Rgt -> I# (word2Int# dt1);
Both -> I# (+# dt (word2Int# dt1))
}
}
However, for simple programs such as:
consumeFoos = foldl' (+) 0 . map consumeFoo
This optimization makes no difference. As is indicated in the other answer, the inner function consumeFoo
is just inlined:
Rec {
$wgo :: [Foo] -> Int# -> Int#
$wgo =
\ (w :: [Foo]) (ww :: Int#) ->
case w of _ {
[] -> ww;
: y ys ->
case y of _ {
Lft dt -> $wgo ys (+# ww dt);
Rgt dt -> $wgo ys (+# ww (word2Int# dt));
Both dt dt1 -> $wgo ys (+# ww (+# dt (word2Int# dt1)))
}
}
end Rec }
vs.
Rec {
$wgo :: [Foo] -> Int# -> Int#
$wgo =
\ (w :: [Foo]) (ww :: Int#) ->
case w of _ {
[] -> ww;
: y ys ->
case y of _ { Foo w1 dt dt1 ->
case w1 of _ {
Lft -> $wgo ys (+# ww dt);
Rgt -> $wgo ys (+# ww (word2Int# dt1));
Both -> $wgo ys (+# ww (+# dt (word2Int# dt1)))
}
}
}
end Rec }
Which, in almost every case when working with low-level, unpacked data, is the goal anyways, as most of your functions are small and cost little to inline.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With