I have a question concerning the definition of strict vs non-strict. The Haskell wiki-book for Laziness (http://en.wikibooks.org/wiki/Haskell/Laziness), under the section "Black-box strictness analysis", makes the following assertion:
[Assuming a function f which takes a single parameter.] The function f is a strict function if, and only if, f undefined results in an error being printed and the halting of our program.
The wiki contrasts const
with id
, showing a non-strict and strict function respectively.
My question is that I was under the impression that foldl was evaluated in a non-strict fashion, causing undesirable space-leaks, while foldl' was strict.
However, the above test seems to assert that both foldl and foldl' are strict. That is both functions produce undefined if any of their parameters are undefined:
> Data.List.foldl (+) undefined [1,2,3,4]
Prelude.undefined
> Data.List.foldl' (+) 0 undefined
Prelude.undefined
> Data.List.foldl' (+) undefined [1,2,3,4]
Prelude.undefined
> Data.List.foldl (+) 0 undefined
Prelude.undefined
Could someone please explain what I'm missing?
Thanks!
Haskell is a non-strict language, and most implementations use a strategy called laziness to run your program.
From HaskellWiki. In functional programming, fold (or reduce) is a family of higher order functions that process a data structure in some order and build a return value. This is as opposed to the family of unfold functions which take a starting value and apply it to a function to generate a data structure.
Function arguments A function is considered strict in one of its arguments if, when the function is applied to a bottom value for that argument, the result is bottom.
Haskell is often described as a lazy language.
A better definition is: A function is said to be strict in an argument if it is undefined in the case that this argument is undefined.
Let us look at the following definition of foldl:
foldl f s [] = s
foldl f s (x:xs) = foldl f (f s x) xs
From this the following can be deduced:
f
, because f
s value is immaterial in the first equation.s
either, since the list could be non-empty and f
could be non-strict in it's first argument (think flip const
).f
and s
are, this argument must be evaluated to so called weak head normal form. A value of algebraic data type is in WHNF if you can tell the outermost constructor. Put differently, there is no way that foldl can avoid looking if the list argument is an empty list or not. And hence, it must evaluate the list value at least so far. But if that value is undefined, none of the 2 equations is applicable, hence this application of foldl
is itself undefined.Furthermore, from the fact that foldl
is not strict in the accumulator s
we can also learn why it is in many cases a bad idea to use foldl
: The system sees no reason to actually evaluate the term f s x
, since it is passed to a function that is not strict in the corresponding argument. Indeed, it would be a violation of non-strictness principles if it did evaluate it. Hence, depending on the length of the list, a huge thunk is accumulated in the accumulator:
((((s `f` x_1) `f` x_2) `f` x_3) ....) `f` x_n)
Now, if f
itself is strict in its left argument, like (+)
, any attempt to evaluate the result of foldl with necessity needs stack space proportional to the length of the list. For, evaluation of f ... x_n
, where ...
stands for the left hand side must first evaluate that left hand side, and so on, down to f s x_1
and back.
Hence we have it that
foldl (flip const) 0 [1..n]
is n
, while
foldl const 0 [1..n]
will stack overflow for large enough n
. This is a sure indicator that here are cases where humans are better at computing than machines, as in the second example the length and content of the list are completely irrelevant and most people will see immediately that the result must be 0.
The section of the wiki that you quote is assuming that you're dealing with a function that takes a single argument. The situation with foldl
is different, first because it's taking multiple arguments, but more importantly because one of those arguments is a function. Let's check out a few examples and see exactly when foldl
is strict in its arguments. (Note that the following also applies to foldl'
.)
> foldl undefined 0 []
0
> foldl undefined 0 [1]
*** Exception: Prelude.undefined
For empty lists, foldl
doesn't need the combining function passed to it. In that case, it's not strict in the first argument.
> foldl (+) undefined [1, 2, 3]
*** Exception: Prelude.undefined
> foldl (+) 0 undefined
*** Exception: Prelude.undefined
When you pass in (+)
as the combining function, it's strict in the starting value and the list. Here's another example, with a different combining function:
> foldl (flip (:)) undefined [1, 2, 3]
[3,2,1*** Exception: Prelude.undefined
Interesting. The starting value is undefined but it seems as though the call to foldl
produced some results before throwing an exception. What about this:
> let (x:xs) = foldl (flip (:)) undefined [1, 2, 3]
> x
3
No exceptions there. So we can get a partial result out of foldl
without hitting the dreaded Prelude.undefined
. Why?
Well, the only thing that changed is the combining function that we passed to foldl
. Whereas (+)
has the type (roughly) Int -> Int -> Int
, flip (:)
has the type [a] -> a -> [a]
. Whereas 3 + ((2 + 1) + undefined)
is not in weak head normal form and must be reduced such that the undefined
is evaluated, (3:(2:(1:undefined)))
is in weak head normal form, which does not require further evaluation, and in particular does not require evaluation of undefined
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With