Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What causes "Error C stack overflow" in haskell usually

What are the usual causes of "Error C stack overflow" in the Hugs Haskell implementation.

like image 570
Nubkadiya Avatar asked Dec 12 '22 23:12

Nubkadiya


1 Answers

This can come up if you are used to functional languages in which it is common to do tail-recursion factorization. Say you have a function:

sum = go 0
    where
    go accum [] = accum
    go accum (x:xs) = go (accum+x) xs

Which, incidentally, is the same as

sum = foldl (+) 0

If we evaluate the function we can see the problem:

sum [1,2,3,4]
go 0 [1,2,3,4]
go (0+1) [2,3,4]
go ((0+1)+2) [3,4]
go (((0+1)+2)+3) [4]
go ((((0+1)+2)+3)+4) []
(((0+1)+2)+3)+4

Now to evaluate that expression Haskell uses a stack:

EXPR            | STACK
(((0+1)+2)+3)+4 | 
((0+1)+2)+3     | +4
(0+1)+2         | +3 +4
(0+1)           | +2 +3 +4
1               | +2 +3 +4
3               | +3 +4
6               | +4
10              |

And this is where an overflow can occur. If you evaluated sum [1..10^6], that stack would be a million entries deep.

But note the pathology here. You recurse over a list only to build up a huge fscking expression ("thunk"), and then use a stack to evaluate it. We would rather evaluate it as we are recursing, by making the tail recursion strict:

sum = go 0
    where
    go accum [] = accum
    go accum (x:xs) = accum `seq` go (accum+x) xs

That says to evaluate accum before trying to evaluate the recursive call, so we get (this may take a some patience to understand):

sum [1,2,3,4]
go 0 [1,2,3,4]
let accum = 0 in accum `seq` go (accum+1) [2,3,4]
go (0+1) [2,3,4]
let accum = 0+1 in accum `seq` go (accum+2) [3,4]
go (1+2) [3,4]
let accum = 1+2 in accum `seq` go (accum+3) [4]
go (3+3) [4]
let accum = 3+3 in accum `seq` go (accum+4) []
go (6+4) []
6+4
10

So as we are traversing the list, we are computing the sum so we don't have to use a deep stack to evaluate the result. This modified tail recursion pattern is encapsulated in Data.List.foldl', so:

sum = foldl' (+) 0

A good rule of thumb is to never use foldl, because it always builds up giant thunks. Use foldl' instead. If the output type has lazy structure (eg. a list or a function), use foldr. But there is no silver bullet for avoiding a stack overflow in general, you just have to understand the evaluation behavior of your program. This can be hard sometimes.

But, if I understand correctly, a stack overflow always comes from trying to evaluate a gigantic thunk, though. So look for places where those could be created, and force evaluation to happen earlier.

like image 109
luqui Avatar answered Jan 05 '23 09:01

luqui