Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Confused on DataKinds extension

I learn type programming of Haskell from Basic Type Level Programming in Haskell but when it introduce DataKinds extension, there are something seems confusing from the example:

{-# LANGUAGE DataKinds #-}
data Nat = Zero | Succ Nat

Now, Nat is promoted to Kind, it is okay. But how about Zero and Succ?

I try to get some information from GHCi, so I type:

:kind Zero

it gives

Zero :: Nat

that is okay, Zero is a type has kind Nat, right? and I try:

:type Zero

it still gives:

Zero :: Nat

that means Zero has type Nat, that is impossible, since Nat is a kind not type, right? Does Nat is both a type and a kind ??

And other confusing thing is, above blog also mentioned that, when create Nat kind, there are two new types: 'Zero and 'Succ are created automatically. When I try it from GHCi again:

:kind 'Zero

gives

'Zero :: Nat

and

:type 'Zero

gives

 Syntax error on 'Zero

Okay, it proves that 'Zero is a type. but what is the purpose of creating 'Zero and 'Succ'??

like image 836
JoeChoi Avatar asked Dec 17 '18 09:12

JoeChoi


1 Answers

In unextended Haskell, the declaration

data A = B

defines two new entities, one each at the computation and type level:

  1. At the type level, a new base type named A (of kind *), and
  2. At the computation level, a new base computation named B (of type A).

When you turn on DataKinds, the declaration

data A = B

now defines four new entities, one at the computation level, two at the type level, and one at the kind level:

  1. At the kind level, a new base kind A,
  2. At the type level, a new base type 'B (of kind A),
  3. At the type level, a new base type A (of kind *), and
  4. At the computation level, a new base computation B (of type A).

This is a strict superset of what we had before: the old (1) is now (3), and the old (2) is now (4).

These new entities explain the following interactions you described:

:type Zero
Zero :: Nat
:kind 'Zero
'Zero :: Nat
:type 'Zero
Syntax error on 'Zero

I think it's clear how it explains the first two. The last one it explains because 'Zero is a type-level thing -- you can't ask for the type of a type, only the type of a computation!

Now, in Haskell, every place where a name occurs, it is clear from the surrounding syntax whether that name is intended to be a computation-level name, a type-level name, or a kind-level name. For this reason, it is somewhat annoying to have to include the tick mark in 'B at the type level -- after all, the compiler knows we're at the type level and therefore can't be referring to the unlifted computation-level B. So for convenience, you are permitted to leave off the tick mark when that is unambiguous.

From now on, I will distinguish between the "back end" -- where there are only the four entities described above and which are always unambiguous -- and the "surface syntax", which is the stuff you type into a file and pass off to GHC that includes ambiguity but is more convenient. Using this terminology, in the surface syntax, one may write the following things, with the following meanings:

Surface syntax    Level           Back end
Name              computation     Name
Name              type            Name if that exists; 'Name otherwise
'Name             type            'Name
Name              kind            Name
---- all other combinations ----  error

This explains the first interaction you had (and the only one left unexplained above):

:kind Zero
Zero :: Nat

Because :kind must be applied to a type-level thing, the compiler knows the surface syntax name Zero must be a type-level thing. Since there is no type-level back end name Zero, it tries 'Zero instead, and gets a hit.

How can it be ambiguous? Well, notice above that we defined two new entities at the type level with one declaration. For simplicity of introduction, I named the new entities on the left- and right-hand side of the equation different things. But let's see what happens if we just tweak the declaration slightly:

data A = A

We still introduce four new back end entities:

  1. Kind A,
  2. Type 'A (of kind A),
  3. Type A (of kind *), and
  4. Computation A (of type A).

Whoops! There is now both an 'A and an A at the type level. If you leave off the tick mark in the surface syntax, it will use (3), and not (2) -- and you can explicitly choose (2) with the surface syntax 'A.

What's more, this doesn't have to happen all from a single declaration. One declaration may produce the ticked version and another the non-ticked version; for example

data A = B
data C = A

will introduce a type-level back end name A from the first declaration and a type-level back end name 'A from the second declaration.

like image 142
Daniel Wagner Avatar answered Oct 18 '22 05:10

Daniel Wagner