Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how does 'undefined' work in Haskell

I'm curious about the 'undefined' value in Haskell. Its interesting because you can put it just about anywhere, and Haskell will be happy. The following are all a-ok

[1.0, 2.0, 3.0 , undefined]  ::[Float]
[1, 2 ,3 undefined, 102312] :: [Int]
("CATS!", undefined)  :: (String, String)
....and many more

How does undefined work under the hood? What makes it possible to have data that is of every data type? Would it be possible for me to define a value like this that I can put everywhere, or is this some special case?

like image 243
MYV Avatar asked May 25 '13 10:05

MYV


3 Answers

There's nothing really special about undefined. It is just an exceptional value -- you could represent it with an infinite loop, or a crash, or a segfault. One way to write it is as a crash:

undefined :: a
undefined | False = undefined

Or a loop:

undefined = undefined

It is an exceptional value that can be an element of any type.

Since Haskell is lazy you can still use such values in computations. E.g.

 > length [undefined, undefined]
 2

But otherwise, it is just a natural consequence of polymorphism and non-strictness.

like image 99
Don Stewart Avatar answered Nov 05 '22 02:11

Don Stewart


The interesting property you're examining is that undefined has the type a for any type a we choose, i.e. undefined :: a with no constraints. As others have noted, undefined might be thought of as an error or infinite loop. I'd like to argue that it's better thought of as "the vacuously true statement". It's an almost unavoidable hole in any type system closely related to the Halting Problem, but it's fun to think about it from the point of view of logic.


One way to think about programming with types is that it's a puzzle. Someone gives you a type and asks you to implement a function which is of that type. For instance

Question:    fn  ::  a -> a
Answer:      fn  =  \x -> x

is an easy one. We need to produce an a for any type a, but we're given one as an input so we can just return that.

With undefined, this game is always easy

Question:    fn  ::  Int -> m (f [a])
Answer:      fn  =  \i   -> undefined    -- backdoor!

so let's be rid of it. Making sense of undefined is easiest when you live in a world without it. Now our game gets harder. Sometimes it's possible

Question:    fn  :: (forall r. (a -> r) -> r) -> a
Answer:      fn  =  \f                        -> f id

But suddenly it's sometimes not possible as well!

Question:    fn  ::  a -> b
Answer:      fn  =   ???                  -- this is `unsafeCoerce`, btw.
                                          -- if `undefined` isn't fair game,
                                          -- then `unsafeCoerce` isn't either

Or is it?

-- The fixed-point combinator, the genesis of any recursive program

Question:    fix  ::  (a -> a) -> a
Answer:      fix  =   \f       -> let a = f a in a

                                          -- Why does this work?
                                          -- One should be thinking of Russell's 
                                          -- Paradox right about now. This plays
                                          -- the same role as a non-wellfounded set.

Which is legal because Haskell's let binding is lazy and (generally) recursive. Now we're golden.

Question:    fn   ::  a -> b
Answer:      fn   =  \a -> fix id         -- This seems unfair?

Even without having undefined built-in we can rebuild it in our game using any old infinite loop. The types check out. To truly prevent ourselves from having undefined in Haskell we'd need to solve the Halting Problem.

Question:    undefined  ::  a
Answer:      undefined  =   fix id

Now, as you've seen, undefined is useful for debugging since it can be a placeholder for any value. It's unfortunately terrible for operations since it either leads to an infinite loop or an immediate crash. It's also really bad for our game because it lets us cheat. Finally, I hope I've demonstrated that it's pretty hard to not have undefined so long as your language has (potentially infinite) loops.

There exist languages like Agda and Coq which trade away loops in order to truly eliminate undefined. They do this because this game I've invented can, in some cases, actually be very valuable. It can encode statements of logic and thus be used to form very, very rigorous mathematical proofs. Your types represent theorems and your programs are assurances that that theorem is substantiated. The existence of undefined would mean that all theorems are substantiatable and thus make the whole system untrustworthy.

But in Haskell, we're more interested in looping than proofing, so we'd rather have fix than be certain there wasn't an undefined.

like image 31
J. Abrahamson Avatar answered Nov 05 '22 02:11

J. Abrahamson


How does undefined work? Well, the best answer, IMHO, is that it doesn't work. But to understand that answer, we have to work through its consequences, which are not obvious to a newcomer.

Basically, if we have undefined :: a, what that means for the type system is that undefined can appear anywhere. Why? Because in Haskell, whenever you see an expression that has some type, you can specialize the type by consistently substituting all instances of any of its type variables for any other type. Familiar examples would be things like this:

map :: (a -> b) -> [a] -> [b]

-- Substitute b := b -> x
map :: (a -> b -> c) -> [a] -> [b -> c]

-- Substitute a := Int
map :: (Int -> b -> c) -> [Int] -> [b -> c]

-- etc.

In the case of map, how does this work? Well, it comes down to the fact that map's arguments provide everything that's necessary to produce an answer, no matter what substitutions and specializations we make for its type variables. If you have a list and a function that consumes values of the same type as the list's elements, you can do what map does, period.

But in the case of undefined :: a, what this signature would mean is that no matter what type a may get specialized to, undefined is able to produce a value of that type. How can it do it? Well, actually, it can't, so if a program actually reaches a step where the value of undefined is needed, there is no way to continue. The only thing the program can do at that point is to fail

The story behind this other case is similar but different:

loop :: a
loop = loop

Here, we can prove that loop has type a by this crazy-sounding argument: suppose that loop has type a. It needs to produce a value of type a. How can it do it? Easy, it just calls loop. Presto!

That sounds crazy, right? Well, the thing is that it's really no different from what's going on in the second equation of this definition of map:

map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs

In that second equation, f x has type b, and (f x:) has type [b] -> [b]; now to conclude our proof that map indeed has the type our signature claims, we need to produce a [b]. So how are we doing it? By assuming that map has the type we're trying to prove it has!

The way Haskell's type inference algorithm works is that it first guesses that the expression's type is a, and then it only changes its guess when it finds something that contradicts that assumption. undefined typechecks to a because it's a flat-out lie. loop typechecks to a because recursion is allowed, and all loop does is recurse.


EDIT: What the heck, I might as well spell out one example. Here's an informal demonstration of how to infer the type of map from this definition:

map f [] = []
map f (x:xs) = f x : map f xs

It goes like this:

  1. We start by provisionally assuming that map :: a.
  2. But map takes two arguments, so a can't be the type. We revise our assumption to this: map :: a -> b -> c; f :: a.
  3. But as we can see in the first equation, the second argument is a list: map :: a -> [b] -> c; f :: a.
  4. But as we can also see in the first equation, the result is also a list: map :: a -> [b] -> [c]; f :: a.
  5. In the second equation, we're pattern matching the second argument against the constructor (:) :: b -> [b] -> [b]. This means that in that equation, x :: b and xs :: [b].
  6. Consider the right hand side of the second equation. Since the result of map f (x:xs) must be of type [c], that means f x : map f xs must also be of type [c].
  7. Given the type of the constructor (:) :: c -> [c] -> [c], that means that f x :: c and map f xs :: [c].
  8. In (7) we concluded that map f xs :: [c]. We had assumed that in (6), and if we had concluded otherwise in (7) this would have been a type error. We can also now dive into this expression and see what types this requires f and xs to have, but to make a longer story short, everything's going to check out.
  9. Since f x :: c and x :: b, we must conclude that f :: b -> c. So now we get map :: (b -> c) -> [b] -> [c].
  10. We're done.

The same process, but for loop = loop:

  1. We provisionally assume that loop :: a.
  2. loop takes no arguments, so its type is consistent with a so far.
  3. The right hand side of loop is loop, which we've provisionally assigned type a, so that checks out.
  4. There's nothing more to consider; we're done. loop has type a.
like image 3
Luis Casillas Avatar answered Nov 05 '22 01:11

Luis Casillas