Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

The pattern with functions like `bool`, `either`, etc

I recently learned about GADTs and their notation:

E.g.

data Maybe a where
  Nothing :: Maybe a
  Just    :: a -> Maybe a

data Either a b where
  Left  :: a -> Either a b
  Right :: b -> Either a b

data Bool where
  False :: Bool
  True  :: Bool

Now I noticed a similiarity to functions like bool, and either, which is basically just like the GADT definition:

  1. taking every line as an argument
  2. replacing the actual type with the next letter of the alphabet
  3. and finally returning a function Type -> (the letter of step 2)

E.g.

maybe  :: b -> (a -> b) -> Maybe a -> b
either :: (a -> c) -> (b -> c) -> Either a b -> c
bool   :: a -> a -> Bool -> a

This also includes foldr, but I noticed that e.g. Tuple doesn't have such a function, though you could easily define it:

tuple :: (a -> b -> c) -> (a,b) -> c
tuple f (x,y) = f x y

What is this pattern? It seems to me these functions alleviate the need for pattern matching (because they give a general way for each case) and thus every function operating on the type can be defined in terms of this function.

like image 876
hgiesel Avatar asked Jul 20 '17 21:07

hgiesel


People also ask

What is a pattern in Haskell?

We use pattern matching in Haskell to simplify our codes by identifying specific types of expression. We can also use if-else as an alternative to pattern matching. Pattern matching can also be seen as a kind of dynamic polymorphism where, based on the parameter list, different methods can be executed.

How does pattern matching work in Haskell?

Pattern matching consists of specifying patterns to which some data should conform and then checking to see if it does and deconstructing the data according to those patterns. When defining functions, you can define separate function bodies for different patterns.

What does in do in Haskell?

in goes along with let to name one or more local expressions in a pure function.

What does H :: t mean in OCaml?

What does H :: t mean in OCaml? The pattern h :: t matches any non-empty list, calls the head of the list h (one element, the first one), and the tail of the list t (zero or more elements after the first one).


1 Answers

First, the types you mention are not really GADTs, they are plain ADTs, since the return type of each constructor is always T a. A proper GADT would be something like

data T a where
   K1 :: T Bool  -- not T a

Anyway, the technique you mention is a well known method to encode algebraic data types into (polymorphic) functions. It goes under many names, like Church encoding, Boehm-Berarducci encoding, endcoding as a catamorphism, etc. Sometimes the Yoneda lemma is used to justify this approach, but there's no need to understand the category-theoretic machinery to understand the method.

Basically, the idea is the following. All the ADTs can be generated by

  • product types (,) a b
  • sum types Either a b
  • arrow types a -> b
  • unit type ()
  • void type Void (rarely used in Haskell, but theoretically nice)
  • variables (if the type bing defined has parameters)
  • possibly, basic types (Integer, ...)
  • type level-recursion

Type level recursion is used when some value constructor takes the type which is being defined as an argument. The classic example is Peano-style naturals:

data Nat where
   O :: Nat
   S :: Nat -> Nat
     -- ^^^ recursive!

Lists are also common:

data List a where
   Nil :: List a
   Cons :: a -> List a -> List a
             -- ^^^^^^ recursive!

Types like Maybe a, pairs, etc. are non recursive.

Note that each ADT, recursive or not, can be reduced to a single constructor with a sigle argument, by summing (Either) over all the constructors, and multiplying all the arguments. For instance, Nat is isomorphic to

data Nat1 where
  O1 :: () -> Nat
  S1 :: Nat -> Nat

which is isomorphic to

data Nat2 where K2 :: (Either () Nat) -> Nat

Lists become

data List1 a where K1 :: (Either () (a, List a)) -> List a

The steps above make use of the algebra of types, which makes the sum and products of types obey the same rules as high school algebra, while a -> b behaves like the exponential b^a.

Hence, we can write any ADT in the form

-- pseudo code
data T where
   K :: F T -> T
type F k = .....

For instance

type F_Nat k = Either () k      -- for T = Nat
type F_List_a k = Either () (a, k) -- for T = List a

(Note that the latter type function F depends on a, but it's not important right now.)

Non recursive types will not use k:

type F_Maybe_a k = Either () a     -- for T = Maybe a

Note that constructor K above makes the type T isomorphic to F T (let's ignore the lifting / extra bottom introduced by it). Essentially, we have that

Nat ~= F Nat = Either () Nat
List a ~= F (List a) = Either () (a, List a)
Maybe a ~= F (Maybe a) = Either () a

We can even formalize this further by abstracting from F

newtype Fix f = Fix { unFix :: f (Fix f) }

By definition Fix F will now be isomorphic to F (Fix F). We could let

type Nat = Fix F_Nat

(In Haskell, we need a newtype wrapper around F_Nat, which I omit for clarity.)

Finally, the general encoding, or catamorphism, is:

cata :: (F k -> k) -> Fix F -> k

This assumes that F is a functor.

For Nat, we get

cata :: (Either () k -> k) -> Nat -> k
-- isomorphic to
cata :: (() -> k, k -> k) -> Nat -> k
-- isomorphic to
cata :: (k, k -> k) -> Nat -> k
-- isomorphic to
cata :: k -> (k -> k) -> Nat -> k

Note the "high school albegra" steps, where k^(1+k) = k^1 * k^k, hence Either () k -> k ~= (() -> k, k -> k).

Note that we get two arguments, k and k->k which correspond to O and S. This is not a coincidence -- we summed over all the constructors. This means that cata expects to be passed the value of type k which "plays the role of O" there, and then the value of type k -> k which plays the role of S.

More informally, cata is telling us that, if we want to map a natural in k, we only have to state what is the "zero" inside k and how to take the "successor" in k, and then every Nat can be mapped consequently.

For lists we get:

cata :: (Either () (a, k) -> k) -> List a -> k
-- isomorphic to
cata :: (() -> k, (a, k) -> k) -> List a -> k
-- isomorphic to
cata :: (k, a -> k -> k) -> List a -> k
-- isomorphic to
cata :: k -> (a -> k -> k) -> List a -> k

which is foldr.

Again, this is cata telling us that, if we state how to take the "empty list" in k and to "cons" a and k inside k, we can map any list in k.

Maybe a is the same:

cata :: (Either () a -> k) -> Maybe a -> k
-- isomorphic to
cata :: (() -> k, a -> k) -> Maybe a -> k
-- isomorphic to
cata :: (k, a -> k) -> Maybe a -> k
-- isomorphic to
cata :: k -> (a -> k) -> Maybe a -> k

If we can map Nothing in k, and perform Just mapping a in k, the we can map any Maybe a in k.

If we try to apply the same approach to Bool and (a,b) we reach the functions which were posted in the questions.

More advanced theoretical topics to look up:

  • (initial) F-algebras in category theory
  • eliminators / recursors / induction principles in type theory (these can be applied to GADTs as well)
like image 136
chi Avatar answered Sep 28 '22 09:09

chi