Haskell IO is often explained in terms of the entire program being a pure function (main
) that returns an IO value (often described as an imperative IO program), which is then executed by the runtime.
This mental model works fine for simple examples, but fell over for me as soon as I saw a recursive main
in Learn You A Haskell. For example:
main = do
line <- getLine
putStrLn line
main
Or, if you prefer:
main = getLine >>= putStrLn >> main
Since main
never terminates, it never actually returns an IO value, yet the program endlessly reads and echoes back lines just fine - so the simple explanation above doesn't quite work. Am I missing something simple or is there a more complete explanation (or is it 'simply' compiler magic) ?
Every recursive function must have a base condition that stops the recursion or else the function calls itself infinitely. The Python interpreter limits the depths of recursion to help avoid infinite recursions, resulting in stack overflows. By default, the maximum depth of recursion is 1000 .
A recursive function is one that calls itself. Direct recursion is the act of calling itself.
A recursive function definition has one or more base cases, meaning input(s) for which the function produces a result trivially (without recurring), and one or more recursive cases, meaning input(s) for which the program recurs (calls itself).
In this case, main
is a value of type IO ()
rather than a function. You can think of it as a sequence of IO a
values:
main = getLine >>= putStrLn >> main
This makes it a recursive value, not unlike infinite lists:
foo = 1 : 2 : foo
We can return a value like this without needing to evaluate the whole thing. In fact, it's a reasonably common idiom.
foo
will loop forever if you try to use the whole thing. But that's true of main
too: unless you use some external method to break out of it, it will never stop looping! But you can start getting elements out of foo
, or executing parts of main
, without evaluating all of it.
The value main
denotes is an infinite program:
main = do
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
line <- getLine
putStrLn line
...
But it's represented in memory as a recursive structure that references itself. That representation is finite, unless someone tries to unfold the entire thing to get a non-recursive representation of the entire program - that would never finish.
But just as you can probably figure out how to start executing the infinite program I wrote above without waiting for me to tell you "all" of it, so can Haskell's runtime system figure out how to execute main
without unfolding the recursion up-front.
Haskell's lazy evaluation is actually interleaved with the runtime system's execution of the main
IO program, so this works even for a function that returns an IO
action which recursively invokes the function, like:
main = foo 1
foo :: Integer -> IO ()
foo x = do
print x
foo (x + 1)
Here foo 1
is not a recursive value (it contains foo 2
, not foo 1
), but it's still an infinite program. However this works just fine, because the program denoted by foo 1
is only generated lazily on-demand; it can be produced as the runtime system's execution of main
goes along.
By default Haskell's laziness means that nothing is evaluated until it's needed, and then only "just enough" to get past the current block. Ultimately the source of all the "need" in "until it's needed" comes from the runtime system needing to know what the next step in the main
program is so it can execute it. But it's only ever the next step; the rest of the program after that can remain unevaluated until after the next step has been fully executed. So infininte programs can be executed and do useful work so long as it's always only a finite amount of work to generate "one more step".
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With