Why Do We Need Sum Types?

Tags:

type-theory

Imagine a language which doesn't allow multiple value constructors for a data type. Instead of writing

data Color = White | Black | Blue

we would have

data White = White
data Black = Black
data Blue = Blue
type Color = White :|: Black :|: Blue

where :|: (written this way rather than | to avoid confusion with sum types) is a built-in type union operator. Pattern matching would work the same way:

show :: Color -> String
show White = "white"
show Black = "black"
show Blue = "blue"

As you can see, in contrast to coproducts, this results in a flat structure, so you don't have to deal with injections. And, unlike sum types, it lets you combine types arbitrarily, which gives greater flexibility and granularity:

type ColorsStartingWithB = Black :|: Blue
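
For comparison, here is a minimal sketch of how the same subset could be written in today's Haskell with ordinary sum types. The names BColor, BBlack, BBlue, toColor and colorName are made up for illustration. The explicit injection function is the bookkeeping that the hypothetical :|: operator would avoid.

```haskell
-- The full colour type as an ordinary sum type.
data Color = White | Black | Blue
  deriving (Show, Eq)

-- A subset needs its own type and its own constructors (hypothetical names),
-- because a constructor name cannot belong to two types in the same module.
data BColor = BBlack | BBlue
  deriving (Show, Eq)

-- Explicit injection from the subset into the full type.
toColor :: BColor -> Color
toColor BBlack = Black
toColor BBlue  = Blue

-- Same shape as the show function in the question, renamed to avoid
-- clashing with Prelude's show.
colorName :: Color -> String
colorName White = "white"
colorName Black = "black"
colorName Blue  = "blue"
```

With these definitions, colorName (toColor BBlue) goes through the injection before the pattern match can happen.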

I believe it wouldn't be a problem to construct recursive data types either:

data Nil = Nil
data Cons a = Cons a (List a)
type List a = Cons a :|: Nil

I know union types are present in TypeScript and probably other languages, but why did the Haskell committee choose ADTs over them?

asked Nov 15 '16 22:11

shock_one

1 Answer

Haskell's sum type is very similar to your :|:.

The difference between the two is that the Haskell sum type | is a tagged union, while your "sum type" :|: is untagged.

Tagged means every alternative carries a label, so you can distinguish Int | Int from Int (in fact, this holds for any type a):

data EitherIntInt = Left Int | Right Int

Here EitherIntInt (equivalently, Either Int Int) carries more information than Int, because a value is either a Left Int or a Right Int.

In your :|:, you cannot distinguish those two:

type EitherIntInt = Int :|: Int

How do you know if it was a left or right Int?
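
As a rough sketch of what the tag buys you in real Haskell, the snippet below uses the standard Either type from the Prelude. The names describeTagged and sameNumberDifferentValues are hypothetical and only serve the illustration.

```haskell
-- Either Int Int is a tagged union of Int with Int.
-- Pattern matching recovers which side a value came from.
describeTagged :: Either Int Int -> String
describeTagged (Left n)  = "left " ++ show n
describeTagged (Right n) = "right " ++ show n

-- The same number gives two distinct values, so the tag is real information.
sameNumberDifferentValues :: Bool
sameNumberDifferentValues = (Left 5 :: Either Int Int) /= Right 5
```

An untagged Int :|: Int would have no counterpart to describeTagged, since there would be nothing to match on besides the number itself.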

See the comments for an extended discussion of the section below.

Tagged unions have another advantage: The compiler can verify whether you as the programmer handled all cases, which is implementation-dependent for general untagged unions. Did you handle all cases in Int :|: Int? Either this is isomorphic to Int by definition or the compiler has to decide which Int (left or right) to choose, which is impossible if they are indistinguishable.
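
A small sketch of the exhaustiveness check on a tagged union, assuming GHC is run with -Wincomplete-patterns (which -Wall includes). The function balanceChange and the debit/credit reading of the two sides are invented for this example.

```haskell
-- Hypothetical reading: Left is a debit, Right is a credit.
-- Compile with -Wincomplete-patterns (part of -Wall) to have GHC check
-- that both tags are handled.
balanceChange :: Either Int Int -> Int
balanceChange (Left debit)   = negate debit
balanceChange (Right credit) = credit
-- Deleting either equation above makes GHC warn about a non-exhaustive match.
```

Because the two cases are told apart by their constructors, the compiler can name exactly which one is missing.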

Consider another example:

type (Integral a, Num b) => IntegralOrNum a b = a :|: b      -- untagged (hypothetical syntax)
type (Integral a, Num b) => IntegralOrNum a b = Either a b   -- tagged

What is 5 :: IntegralOrNum Int Double in the untagged union? The literal 5 is a valid Int (an Integral type) and a valid Double (a Num type), so we can't decide for sure which alternative is meant and have to rely on implementation details. The tagged union, on the other hand, knows exactly what 5 should be, because it is branded with either Left or Right.
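
The tagged case can be sketched in real Haskell with the Prelude's Either. The names asIntegral, asNum and render are hypothetical. The point is that the constructor fixes the type of the literal.

```haskell
-- The constructor decides how the literal 5 is typed.
asIntegral :: Either Int Double
asIntegral = Left 5    -- here 5 is an Int

asNum :: Either Int Double
asNum = Right 5        -- here 5 is a Double

-- Rendering shows that the two values really differ.
render :: Either Int Double -> String
render (Left i)  = "Int " ++ show i
render (Right d) = "Double " ++ show d
```

Writing a bare 5 :: Either Int Double is rejected by the type checker (there is no Num instance for Either Int Double), so the programmer is forced to pick a side.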

As for naming: The disjoint union in Haskell is a union type. ADTs are only a means of implementing these.

answered Oct 09 '22 10:10

ThreeFx
