Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Questions about folds and stack overflows

Learn You a Haskell talks about foldl' as an alternative to foldl because foldl is prone to stack overflows.

  • According to LYAH, foldl (+) 0 (replicate 1000000 1) should stack overflow, but it doesn't on my machine. Why not? Even if I increase that number to 10 million it doesn't overflow. It just takes up a lot of memory until my OS X computer becomes unusable and I have to reboot it.
  • In what cases should I use foldl instead of foldl'? In my experience foldl' "just works" whereas foldl can essentially crash my computer (see above).
  • I don't understand why there is nothing similar for foldr. Why can't foldr stack overflow and why is there no foldr'?
like image 706
stackoverflowuser Avatar asked Feb 12 '15 16:02

stackoverflowuser


2 Answers

It doesn't crash with stack overflow because the stack is now infinite by default. That is, the default GHC runtime behavior is to allow the stack to grow indefinitely -- there's no bound which can trigger a stack overflow error.

https://ghc.haskell.org/trac/ghc/ticket/8189

Here is a description how it works:

Thread stacks (including the main thread's stack) live on the heap. As the stack grows, new stack chunks are added as required; if the stack shrinks again, these extra stack chunks are reclaimed by the garbage collector. The default initial stack size is deliberately small, in order to keep the time and space overhead for thread creation to a minimum, and to make it practical to spawn threads for even tiny pieces of work.

like image 146
Yuras Avatar answered Nov 08 '22 18:11

Yuras


Why can't foldr stack overflow and why is there no foldr'?

Well, foldr is not tail recursive, i.e. it does not directly call upon itself:

foldr f a (x:xs) = f x (foldr f a xs)

After foldr is reduced, the control passes to the user-provided f. Therefore, there's no need to have a foldr' that forces the arguments before passing them to f: if that's wanted, the caller can just pass a strict f (e.g. exploiting bang patterns or seq).

Whether it does overflow the stack depends on what f does. For instance,

f x y = x

will cause the list to be accessed only in its first element. On the contrary,

f x y = seq y x

could cause a stack overflow (or an memory-intensive behavior). Instead,

f x y = x+1 : y

will cause the output list to be produced lazily, similarly to map (+1) xs, without any nasty surprise.

As @dfeuer points out, there exists Data.Foldable.foldr' which operates on any Foldable as a strict right fold. On lists, that is pretty much redundant, as discussed above. On other Foldable types, it can instead be meaningful.

like image 32
chi Avatar answered Nov 08 '22 18:11

chi