I'm curious about the 'undefined' value in Haskell. Its interesting because you can put it just about anywhere, and Haskell will be happy. The following are all a-ok
[1.0, 2.0, 3.0 , undefined] ::[Float]
[1, 2 ,3 undefined, 102312] :: [Int]
("CATS!", undefined) :: (String, String)
....and many more
How does undefined work under the hood? What makes it possible to have data that is of every data type? Would it be possible for me to define a value like this that I can put everywhere, or is this some special case?
There's nothing really special about undefined
. It is just an exceptional value -- you could represent it with an infinite loop, or a crash, or a segfault. One way to write it is as a crash:
undefined :: a
undefined | False = undefined
Or a loop:
undefined = undefined
It is an exceptional value that can be an element of any type.
Since Haskell is lazy you can still use such values in computations. E.g.
> length [undefined, undefined]
2
But otherwise, it is just a natural consequence of polymorphism and non-strictness.
The interesting property you're examining is that undefined
has the type a
for any type a
we choose, i.e. undefined :: a
with no constraints. As others have noted, undefined
might be thought of as an error or infinite loop. I'd like to argue that it's better thought of as "the vacuously true statement". It's an almost unavoidable hole in any type system closely related to the Halting Problem, but it's fun to think about it from the point of view of logic.
One way to think about programming with types is that it's a puzzle. Someone gives you a type and asks you to implement a function which is of that type. For instance
Question: fn :: a -> a
Answer: fn = \x -> x
is an easy one. We need to produce an a
for any type a
, but we're given one as an input so we can just return that.
With undefined
, this game is always easy
Question: fn :: Int -> m (f [a])
Answer: fn = \i -> undefined -- backdoor!
so let's be rid of it. Making sense of undefined
is easiest when you live in a world without it. Now our game gets harder. Sometimes it's possible
Question: fn :: (forall r. (a -> r) -> r) -> a
Answer: fn = \f -> f id
But suddenly it's sometimes not possible as well!
Question: fn :: a -> b
Answer: fn = ??? -- this is `unsafeCoerce`, btw.
-- if `undefined` isn't fair game,
-- then `unsafeCoerce` isn't either
Or is it?
-- The fixed-point combinator, the genesis of any recursive program
Question: fix :: (a -> a) -> a
Answer: fix = \f -> let a = f a in a
-- Why does this work?
-- One should be thinking of Russell's
-- Paradox right about now. This plays
-- the same role as a non-wellfounded set.
Which is legal because Haskell's let
binding is lazy and (generally) recursive. Now we're golden.
Question: fn :: a -> b
Answer: fn = \a -> fix id -- This seems unfair?
Even without having undefined
built-in we can rebuild it in our game using any old infinite loop. The types check out. To truly prevent ourselves from having undefined
in Haskell we'd need to solve the Halting Problem.
Question: undefined :: a
Answer: undefined = fix id
Now, as you've seen, undefined
is useful for debugging since it can be a placeholder for any value. It's unfortunately terrible for operations since it either leads to an infinite loop or an immediate crash. It's also really bad for our game because it lets us cheat. Finally, I hope I've demonstrated that it's pretty hard to not have undefined
so long as your language has (potentially infinite) loops.
There exist languages like Agda and Coq which trade away loops in order to truly eliminate undefined
. They do this because this game I've invented can, in some cases, actually be very valuable. It can encode statements of logic and thus be used to form very, very rigorous mathematical proofs. Your types represent theorems and your programs are assurances that that theorem is substantiated. The existence of undefined
would mean that all theorems are substantiatable and thus make the whole system untrustworthy.
But in Haskell, we're more interested in looping than proofing, so we'd rather have fix
than be certain there wasn't an undefined
.
How does undefined
work? Well, the best answer, IMHO, is that it doesn't work. But to understand that answer, we have to work through its consequences, which are not obvious to a newcomer.
Basically, if we have undefined :: a
, what that means for the type system is that undefined
can appear anywhere. Why? Because in Haskell, whenever you see an expression that has some type, you can specialize the type by consistently substituting all instances of any of its type variables for any other type. Familiar examples would be things like this:
map :: (a -> b) -> [a] -> [b]
-- Substitute b := b -> x
map :: (a -> b -> c) -> [a] -> [b -> c]
-- Substitute a := Int
map :: (Int -> b -> c) -> [Int] -> [b -> c]
-- etc.
In the case of map
, how does this work? Well, it comes down to the fact that map
's arguments provide everything that's necessary to produce an answer, no matter what substitutions and specializations we make for its type variables. If you have a list and a function that consumes values of the same type as the list's elements, you can do what map does, period.
But in the case of undefined :: a
, what this signature would mean is that no matter what type a
may get specialized to, undefined
is able to produce a value of that type. How can it do it? Well, actually, it can't, so if a program actually reaches a step where the value of undefined
is needed, there is no way to continue. The only thing the program can do at that point is to fail
The story behind this other case is similar but different:
loop :: a
loop = loop
Here, we can prove that loop
has type a
by this crazy-sounding argument: suppose that loop
has type a
. It needs to produce a value of type a
. How can it do it? Easy, it just calls loop
. Presto!
That sounds crazy, right? Well, the thing is that it's really no different from what's going on in the second equation of this definition of map
:
map :: (a -> b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
In that second equation, f x
has type b
, and (f x:)
has type [b] -> [b]
; now to conclude our proof that map
indeed has the type our signature claims, we need to produce a [b]
. So how are we doing it? By assuming that map
has the type we're trying to prove it has!
The way Haskell's type inference algorithm works is that it first guesses that the expression's type is a
, and then it only changes its guess when it finds something that contradicts that assumption. undefined
typechecks to a
because it's a flat-out lie. loop
typechecks to a
because recursion is allowed, and all loop
does is recurse.
EDIT: What the heck, I might as well spell out one example. Here's an informal demonstration of how to infer the type of map
from this definition:
map f [] = []
map f (x:xs) = f x : map f xs
It goes like this:
map :: a
.a
can't be the type. We revise our assumption to this: map :: a -> b -> c; f :: a
.map :: a -> [b] -> c; f :: a
.map :: a -> [b] -> [c]; f :: a
.(:) :: b -> [b] -> [b]
. This means that in that equation, x :: b
and xs :: [b]
.map f (x:xs)
must be of type [c]
, that means f x : map f xs
must also be of type [c]
.(:) :: c -> [c] -> [c]
, that means that f x :: c
and map f xs :: [c]
.map f xs :: [c]
. We had assumed that in (6), and if we had concluded otherwise in (7) this would have been a type error. We can also now dive into this expression and see what types this requires f
and xs
to have, but to make a longer story short, everything's going to check out.f x :: c
and x :: b
, we must conclude that f :: b -> c
. So now we get map :: (b -> c) -> [b] -> [c]
.The same process, but for loop = loop
:
loop :: a
.loop
takes no arguments, so its type is consistent with a
so far.loop
is loop
, which we've provisionally assigned type a
, so that checks out.loop
has type a
.If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With