Now that we have injective type families, is there any remaining use case for using data families over type families?
Looking at past StackOverflow questions about data families, there is this question from a couple years ago discussing the difference between type families and data families, and this answer about use cases of data families. Both say that the injectivity of data families is their greatest strength.
Looking at the docs on data families, I see reason not to rewrite all uses of data families using injective type families.
For example, say I have a data family (I've merged some examples from the docs to try to squeeze in all the features of data families)
data family G a b
data instance G Int Bool = G11 Int | G12 Bool deriving (Eq)
newtype instance G () a = G21 a
data instance G [a] b where
G31 :: c -> G [Int] b
G32 :: G [a] Bool
I might as well rewrite it as
type family G a b = g | g -> a b
type instance G Int Bool = G_Int_Bool
type instance G () a = G_Unit_a a
type instance G [a] b = G_lal_b a b
data G_Int_Bool = G11 Int | G12 Bool deriving (Eq)
newtype G_Unit_a a = G21 a
data G_lal_b a b where
G31 :: c -> G_lal_b [Int] b
G32 :: G_lal_b [a] Bool
It goes without saying that associated instances for data families correspond to associated instances with type families in the same way. Then is the only remaining difference that we have less things in the type-namespace?
As a followup, is there any benefit to having less things in the type-namespace? All I can think of is that this will become debugging hell for someone playing with this on ghci
- the types of the constructors all seem to indicate that the constructors are all under one GADT...
Data families are the indexed form of data and newtype definitions. Type synonym families are the indexed form of type synonyms. Each of these flavors can be defined in a standalone manner or associated with a type class.
A closed type family has all of its equations defined in one place and cannot be extended, whereas an open family can have instances spread across modules. The advantage of a closed family is that its equations are tried in order, similar to a term-level function definition.
From HaskellWiki. Wikipedia says, "In type theory, a kind is the type of a type constructor or, less commonly, the type of a higher-order type operator.
type family T a = r | r -> a
data family D a
An injective type family T
satisfies the injectivity axiom
if
T a ~ T b
thena ~ b
But a data family satisfies the much stronger generativity axiom
if
D a ~ g b
thenD ~ g
anda ~ b
(If you like: Because the instances of D
define new types that are different from any existing types.)
In fact D
itself is a legitimate type in the type system, unlike a type family like T
, which can only ever appear in a fully saturated application like T a
. This means
D
can be the argument to another type constructor, like MaybeT D
. (MaybeT T
is illegal.)
You can define instances for D
, like instance Functor D
. (You can't define instances for a type family Functor T
, and it would be unusable anyway because instance selection for, e.g., map :: Functor f => (a -> b) -> f a -> f b
relies on the fact that from the type f a
you can determine both f
and a
; for this to work f
cannot be allowed to vary over type families, even injective ones.)
You're missing one other detail - data families create new types. Type families can only refer to other types. In particular, every instance of a data family declares new constructors. And it's nicely generic. You can create a data instance with newtype instance
if you want newtype semantics. Your instance can be a record. It can have multiple constructors. It can even be a GADT if you want.
It's exactly the difference between the type
and data
/newtype
keywords. Injective type families don't give you new types, rendering them useless in the case where you need that.
I understand where you're coming from. I had this same issue with the difference initially. Then I finally ran into a use case where they're useful, even without a type class getting involved.
I wanted to write an api for dealing with mutable cells in a few different contexts, without using classes. I knew I wanted to do it with a free monad with interpreters in IO
, ST
, and maybe some horrible hacks with unsafeCoerce
to even go so far as shoehorning it into State
. This wasn't for any practical purpose, of course - I was just exploring API designs.
So I had something like this:
data MutableEnv (s :: k) a ...
newRef :: a -> MutableEnv s (Ref s a)
readRef :: Ref s a -> MutableEnv s a
writeRef :: Ref s a -> a -> MutableEnv s ()
The definition of MutableEnv
wasn't important. Just standard free/operational monad stuff with constructors matching the three functions in the api.
But I was stuck on what to define Ref as. I didn't want some sort of class, I wanted it to be a concrete type as far as the type system was concerned.
Then late one night I was out for a walk and it hit me - what I essentially want is a type whose constructors are indexed by an argument type. But it had to be open, unlike a GADT - new interpreters could be added at will. And then it hit me. That's exactly what a data family is. An open, type-indexed family of data values. I could complete the api with just the following:
data family Ref (s :: k) :: * -> *
Then, dealing with the underlying representation for a Ref was no big deal. Just create a data instance (or newtype instance, more likely) whenever an interpreter for MutableEnv
is defined.
This exact example isn't really useful. But it clearly illustrates something data families can do that injective type families can't.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With