
Restrictions of unboxed types

I wonder why unboxed types in Haskell have these restrictions:

  1. You cannot define a newtype for an unboxed type:

    newtype Vec = Vec (# Float#, Float# #)
    

    but you can define a type synonym:

    type Vec = (# Float#, Float# #)
    
  2. Type families can't return an unboxed type:

    type family Unbox (a :: *) :: # where
        Unbox Int    = Int#
        Unbox Word   = Word#
        Unbox Float  = Float#
        Unbox Double = Double#
        Unbox Char   = Char#
    

Are there fundamental reasons behind this, or is it just that no one has asked for these features?

Alexey Vagarenko asked Nov 17 '15 09:11


2 Answers

Parametric polymorphism in Haskell relies on the fact that all values of types t :: * are uniformly represented as a pointer to a runtime object. Thus, the same machine code works for all instantiations of polymorphic values.
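As a minimal sketch of this distinction (the names identity and twice are made up for illustration), a single polymorphic definition serves every boxed type, while code over a raw Int# has to be written at that specific type:

```haskell
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, (+#))

-- One definition, one piece of machine code: every argument of a
-- lifted type arrives as a pointer to a heap object.
identity :: a -> a
identity x = x

-- Monomorphic code over the raw machine integer. Int# is not a
-- pointer, so it cannot be used where a type of kind * is expected.
twice :: Int# -> Int#
twice n = n +# n

main :: IO ()
main = do
  print (identity (3 :: Int))
  putStrLn (identity "hello")
  -- Box the raw result with I# so that it can be printed.
  print (I# (twice 21#))
  -- print (I# (identity 21#))  -- rejected: Int# does not have kind *
```

The commented-out line is the interesting one: identity only accepts types of kind *, so the type checker refuses to instantiate it at Int#.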

Contrast this with polymorphic functions in Rust or C++. For example, the identity function there still has a type analogous to forall a. a -> a, but since values of different a types may have different sizes, the compilers have to generate different code for each instantiation. This also means that we can't pass polymorphic functions around in runtime boxes:

data Id = Id (forall a. a -> a)

since such a function would have to work correctly for arbitrary-sized objects. Allowing this requires some additional infrastructure: for example, we could require that a runtime forall a. a -> a function takes extra implicit arguments that carry information about the size and the constructors/destructors of values of type a.
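In GHC, by contrast, the Id wrapper is unproblematic, precisely because of the uniform representation. A small sketch, assuming the RankNTypes extension (useTwice is a hypothetical helper, not part of any library):

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

-- A runtime box holding a function that is still polymorphic.
data Id = Id (forall a. a -> a)

-- The same unpacked function is applied at two different types.
-- No per-type code is generated: both arguments are just pointers.
useTwice :: Id -> (Int, String)
useTwice (Id f) = (f 1, f "one")

main :: IO ()
main = print (useTwice (Id id))
```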

Now, the problem with newtype Vec = Vec (# Float#, Float# #) is that even though Vec would have kind *, runtime code that expects values of some t :: * can't handle it. It's a pair of raw floats held in registers or on the stack, not a pointer to a Haskell object, and passing it to code expecting Haskell objects would result in segfaults or errors.

In general (# a, b #) isn't necessarily pointer-sized, so we can't copy it into pointer-sized data fields.
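For what it's worth, GHC releases newer than this answer (8.10 onward) provide an UnliftedNewtypes extension that avoids exactly this mismatch: the newtype is accepted, but it is given the kind of the type it wraps rather than kind *, so it still can't be used where a boxed type is expected. A minimal sketch, assuming GHC 8.10 or newer (addVec is a made-up helper):

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples, UnliftedNewtypes #-}
module Main where

import GHC.Exts (Float (F#), Float#, plusFloat#)

-- Accepted with UnliftedNewtypes. Vec does not get kind *; it has
-- the same unlifted kind as the unboxed tuple it wraps.
newtype Vec = Vec (# Float#, Float# #)

-- Component-wise addition on the raw floats.
addVec :: Vec -> Vec -> Vec
addVec (Vec (# a, b #)) (Vec (# c, d #)) =
  Vec (# plusFloat# a c, plusFloat# b d #)

main :: IO ()
main =
  case addVec (Vec (# 1.0#, 2.0# #)) (Vec (# 3.0#, 4.0# #)) of
    -- Box the components with F# so that they can be shown.
    Vec (# x, y #) -> print (F# x, F# y)
```

Because Vec is not of kind * here, a type such as [Vec] is still rejected, which is consistent with the representation argument above.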

Type families returning # types are disallowed for related reasons. Consider the following:

type family Foo (a :: *) :: # where
  Foo Int = Int#
  Foo a   = (# Int#, Int# #)

data Box = forall (a :: *). Box (Foo a)

Our Box is not representable at runtime, since Foo a has different sizes for different choices of a. Generally, polymorphism over # would require generating different code for different instantiations, as in Rust, but this interacts badly with regular parametric polymorphism and makes the runtime representation of polymorphic values difficult, so GHC doesn't bother with any of this.

(This is not to say that a usable implementation couldn't possibly be devised.)

András Kovács answered Oct 10 '22 21:10


A newtype would allow one to define class instances

instance C Vec where ...

which cannot be defined for unboxed tuples. Type synonyms, by contrast, do not offer such functionality.

Also, Vec would not be a boxed type. This means that you could no longer instantiate type variables with Vec in general, unless their kind allows it. For instance, [Vec] should be disallowed. The compiler would have to keep track of "regular" newtypes and "unboxed" newtypes in some way. The only benefit, I think, would be allowing the data constructor Vec to wrap unboxed values at compile time (since it is removed at runtime). I guess this would probably not be useful enough to justify making the necessary changes to the type inference engine.

chi answered Oct 10 '22 20:10