While learning to read GHC Core, I noticed that the automatically derived Eq instance for enum-style data types such as
data EType = ETypeA | ETypeB | ETypeC | ETypeD
           | ETypeE | ETypeF | ETypeG | ETypeH
           deriving (Eq)
seems to be compiled into an O(N)-like lookup, judging by GHC's Core representation:
$fEqEType_$c== =
  \ (a_ahZ :: EType) (b_ai0 :: EType) ->
    case a_ahZ of _ {
      ETypeA ->
        case b_ai0 of _ {
          ETypeA -> True;
          ETypeB -> False;
          ETypeC -> False;
          ETypeD -> False;
          ETypeE -> False;
          ETypeF -> False;
          ETypeG -> False;
          ETypeH -> False
        };
      ETypeB -> case b_ai0 of _ {__DEFAULT -> False; ETypeB -> True};
      ETypeC -> case b_ai0 of _ {__DEFAULT -> False; ETypeC -> True};
      ETypeD -> case b_ai0 of _ {__DEFAULT -> False; ETypeD -> True};
      ETypeE -> case b_ai0 of _ {__DEFAULT -> False; ETypeE -> True};
      ETypeF -> case b_ai0 of _ {__DEFAULT -> False; ETypeF -> True};
      ETypeG -> case b_ai0 of _ {__DEFAULT -> False; ETypeG -> True};
      ETypeH -> case b_ai0 of _ {__DEFAULT -> False; ETypeH -> True}
    }
Am I misinterpreting the GHC Core output? Shouldn't algebraic data types provide an integer id for each constructor, which could then be compared directly in O(1)? Also, why does the first case clause for ETypeA not make use of __DEFAULT as the other clauses do?
Update:
As suggested by Simon Marlow, I added a 9th constructor, ETypeI, and GHC then switched to using dataToTag#:
$fEqEType_$c/= =
  \ (a_ahS :: EType) (b_ahT :: EType) ->
    case dataToTag# @ EType a_ahS of a#_ahQ {
      __DEFAULT ->
        case dataToTag# @ EType b_ahT of b#_ahR {
          __DEFAULT ->
            case ==# a#_ahQ b#_ahR of _ {
              False -> True; True -> False
            }
        }
    }
This raises a further question for me: what are the trade-offs between GHC Core's case and the use of dataToTag#, and why does GHC implement this particular cut-off of 9 constructors for switching to dataToTag#?
Equality comparison of EType is O(1) because the case construct is O(1).
There might or might not be an integer tag for constructors. There are several low-level representation choices, so the generated Core works for all of them. That said, you can always make an integer tag for constructors, and that's how I usually implement the derived comparison when I write Haskell compilers.
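To make the "integer tag" idea concrete, here is a minimal sketch of a hand-written Eq instance that compares constructor indices instead of matching on every constructor. It is only an illustration, not what GHC's deriving mechanism emits: the helper name eqByTag is made up for this example, and it assumes that deriving Enum is acceptable for the type so that fromEnum can stand in for the constructor tag.

```haskell
-- EType mirrors the type from the question; Eq is deliberately NOT derived.
data EType = ETypeA | ETypeB | ETypeC | ETypeD
           | ETypeE | ETypeF | ETypeG | ETypeH
           deriving (Enum, Show)

-- Hypothetical helper: compare two values by their constructor index.
-- fromEnum on a derived Enum instance numbers constructors from 0,
-- in declaration order.
eqByTag :: EType -> EType -> Bool
eqByTag a b = fromEnum a == fromEnum b

-- Hand-written instance using the index comparison (an assumption for
-- illustration, not the code GHC generates for `deriving (Eq)`).
instance Eq EType where
  (==) = eqByTag
```

Loaded into GHCi, ETypeC == ETypeC and ETypeC /= ETypeD should both hold. Whether this ends up faster than the derived instance depends on how GHC compiles fromEnum for the type, so it is worth checking the Core output rather than assuming.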
I have no idea why ETypeA gets different treatment. It looks like a bug.