Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why isn't there a simple syntax for coproduct types in Haskell?

Product types in Haskell are easily definable:

data Person String String 

is a product of two types. The coproduct of two types is

type Shape=Either Circle Rectangle

But whereas the product is easily extendible to three or more types, it doesn't seem so simple for coproducts. Is there a theoretical rationale behind this difference or is the reason purely technical?

like image 209
Mozibur Ullah Avatar asked Jan 10 '13 02:01

Mozibur Ullah


1 Answers

data Apple = Gala | Fuji | PinkLady
data Orange = Navel | Blood
data Berry = Blueberry | Cranberry | Raspberry

data Fruit = Apple Apple
           | Orange Orange
           | Berry Berry

Here, Fruit is a coproduct of Apple, Orange, and Berry1.

Note that non-tagged unions are not coproducts.

1: Well, kinda.  Fruit also contains an extra element, . See below.

Response to edited question

data Shape = Either Circle Rectangle

You probably mean:

type Shape = Either Circle Rectangle

If you use data, you've defined a product type with a single constructor named Either. This is perfectly legal. If you use type, you've defined Shape as another name for Either Circle Rectangle, which is a coproduct of Circle and Rectangle.

Hask versus Set

Let's call the category of types and functions in Haskell, Hask. This is the usual name for it. It does meet the definition for category, assuming you don't look too closely at these finite things we call computers.

And let's compare Hask to the category Set. This is natural, because Hask is a concrete category. Compare the (,) type constructor in Hask with the cartesian product in Set. If we want the product of Int and Int, we get:

  • ⊥ ∈ (Int, Int) (in Hask), but
  • ⊥ ∉ Int ⨯ Int (in Set).

So you can see that the type constructor (,) is not the same as the cartesian product, because it contains one extra member, . We can repeat the argument for the disjoint union:

  • ⊥ ∈ Either Int Int (in Hask), but
  • ⊥ ∉ Int ⊔ Int (in Set).

In each case, the structure in Hask contains an additional element, , that the equivalent structure in Set would not have.

Hask versus Pointed sets

Hask is also not the category of pointed sets. First of all, Hask contains morphisms that are not pointed set morphisms.

  1. For every type T in Hask, we can construct a function T -> T such that f x = ⊥ for all x. Therefore, must be the basepoint, if objects in Hask are pointed sets. Note that all such f are strict functions.

  2. However, let g be any lazy (the correct term here is actually "non-strict") function. By the definition of strictness, g ⊥ ≠ ⊥. However, with #1, this contradicts the premise that Hask is the category of pointed sets.

Additionally, the product and coproduct structures are different, in a similar manner to the way the structures are different from the structures of Set. For products,

  • (⊥, ⊥) ∈ (Int, Int) (in Hask), but
  • (⊥, ⊥) ∉ Int ⊗ Int (in Pointed Set).

This follows from the problem with morphisms: in pointed sets, all functions are strict -- this includes constructors such as (,). The coproduct has the same problem:

  • Left ⊥ ∈ Either Int Int (in Hask), but
  • Left ⊥ ∉ Int ⊕ Int (in Pointed Set).

Conclusion

So, Set and Pointed Set are both not quite equal to the category Hask. As noted in the Hask page on the Haskell Wiki, the "product" and "coproduct" types in Haskell simply don't meet the definition of categorical products and coproducts. So strictly speaking, products and coproducts don't exist in Haskell.

That's the bad news. There is good news.

  1. Consider all strict functions and strict constructors in Hask. The result is a subcategory of Hask which is also a subcategory of Pointed Set. This subcategory is a cartesian closed category.

  2. Consider all total functions in Hask, and consider two functions as the same morphism if they produce the same output for every input other than . (These outputs must necessarily not be by the definition of "total".) The result is a subcategory of Set. This subcategory is a cartesian closed category.

So you can still work with cartesian closed categories as long as you play by the right set of rules. You even get to choose from two different categories to work with! However, if you play by these rules, then you are working with a subset of Haskell.

There is a final bit of good news. Strict functions can be modified into lazy functions without changing the outputs of the program as a whole, assuming the strict version of the program terminates. So you can pretend that doesn't exist and get some work done with category theory, but still write programs that exploit lazy evaluation.

Summary for the lazy

Pretending that Hask has products and coproducts won't get you into trouble.

like image 84
Dietrich Epp Avatar answered Sep 28 '22 02:09

Dietrich Epp