I wonder why unboxed types in Haskell have these restrictions:
You cannot define a newtype for an unboxed type:
newtype Vec = Vec (# Float#, Float# #)
but you can define a type synonym:
type Vec = (# Float#, Float# #)
Type families can't return an unboxed type:
type family Unbox (a :: *) :: # where
    Unbox Int = Int#
    Unbox Word = Word#
    Unbox Float = Float#
    Unbox Double = Double#
    Unbox Char = Char#
Are there fundamental reasons behind this, or is it just because no one has asked for these features?
Parametric polymorphism in Haskell relies on the fact that values of all types t :: * are uniformly represented as a pointer to a runtime object. Thus, the same machine code works for all instantiations of polymorphic values.
Contrast polymorphic functions in Rust or C++. For example, the identity function there still has a type analogous to forall a. a -> a, but since values of different types a may have different sizes, the compilers have to generate different code for each instantiation. This also means that we can't pass polymorphic functions around in runtime boxes:
data Id = Id (forall a. a -> a)
since such a function would have to work correctly for arbitrary-sized objects. Allowing this requires additional infrastructure; for example, we could require that a runtime forall a. a -> a function take extra implicit arguments that carry information about the size and the constructors/destructors of a values.
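In Haskell, by contrast, the uniform pointer representation makes such a box unproblematic. A minimal sketch, assuming the RankNTypes extension; the names applyBoth and example are made up for illustration:

```haskell
{-# LANGUAGE RankNTypes #-}

-- A runtime box holding a polymorphic function.
data Id = Id (forall a. a -> a)

-- The single stored function is used at two different types.
-- No specialised copies are needed: both arguments are pointers.
applyBoth :: Id -> (Int, Bool)
applyBoth (Id f) = (f 3, f True)

-- Hypothetical usage: box up the Prelude's id and apply it twice.
example :: (Int, Bool)
example = applyBoth (Id id)
```

Here the same compiled f handles both the Int and the Bool argument, which is exactly what the monomorphising approach cannot do without extra runtime information.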
Now, the problem with newtype Vec = Vec (# Float#, Float# #) is that even though Vec has kind *, runtime code that expects values of some t :: * can't handle it. It's a stack-allocated pair of floats, not a pointer to a Haskell object, and passing it to code expecting Haskell objects would result in segfaults or errors.
In general, (# a, b #) isn't necessarily pointer-sized, so we can't copy it into pointer-sized data fields.
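For comparison, an ordinary data declaration may have unboxed fields, because a value of the data type is itself a pointer to a heap object that contains them. The following is only a sketch of that contrast, assuming the MagicHash and UnboxedTuples extensions; the names VecU, box, unbox, addU, toPair, sumVecs and example are invented for this illustration:

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (Float (F#), Float#, plusFloat#)

-- A synonym for the raw pair; its kind is not *, so it cannot be a list element.
type VecU = (# Float#, Float# #)

-- An ordinary data type: a Vec value is a pointer to a heap object (kind *)
-- that stores the two raw floats in its fields.
data Vec = Vec Float# Float#

box :: VecU -> Vec
box (# x, y #) = Vec x y

unbox :: Vec -> VecU
unbox (Vec x y) = (# x, y #)

-- Arithmetic directly on the unboxed pairs.
addU :: VecU -> VecU -> VecU
addU (# a, b #) (# c, d #) = (# plusFloat# a c, plusFloat# b d #)

-- Re-box the components as ordinary Floats for inspection.
toPair :: Vec -> (Float, Float)
toPair (Vec x y) = (F# x, F# y)

-- [Vec] is fine, because every list element is a pointer.
sumVecs :: [Vec] -> Vec
sumVecs = foldr (\v acc -> box (addU (unbox v) (unbox acc))) (Vec 0.0# 0.0#)

example :: (Float, Float)
example = toPair (sumVecs [Vec 1.0# 2.0#, Vec 3.0# 4.5#])
```

The cost is one level of indirection: unlike the hypothetical newtype, the Vec constructor does exist at runtime.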
Type families returning # types are disallowed for related reasons. Consider the following:
type family Foo (a :: *) :: # where
    Foo Int = Int#
    Foo a = (# Int#, Int# #)
data Box = forall (a :: *). Box (Foo a)
Our Box is not representable at runtime, since Foo a has different sizes for different choices of a. Generally, polymorphism over # would require generating different code for different instantiations, like in Rust, but this interacts badly with regular parametric polymorphism and makes the runtime representation of polymorphic values difficult, so GHC doesn't bother with any of this.
(I'm not saying, though, that a usable implementation couldn't possibly be devised.)
A newtype would allow one to define class instances such as
instance C Vec where ...
which cannot be defined for unboxed tuples. Type synonyms do not offer this functionality.
Also, Vec would not be a boxed type. This means that you could no longer instantiate type variables with Vec in general, unless their kind allows it. For instance, [Vec] should be disallowed. The compiler would have to keep track of "regular" newtypes and "unboxed" newtypes in some way. I think the only benefit would be to allow the data constructor Vec to wrap unboxed values at compile time (since it is removed at runtime). I guess this would probably not be useful enough to justify the necessary changes to the type inference engine.