Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Instance of poorly performing non tail-recursive function

Below are two functions defined to find the maximum value of a list of numbers.

mx :: (Ord a) => [a] -> a
mx [] = error "Empty list"
mx [x] = x
mx (x:xs) 
    | x > (mx xs) = x
    | otherwise = (mx xs)

mx' (x:xs) = findMax x xs
  where
    findMax cmx [] = cmx
    findMax cmx (x:xs) | x > cmx = findMax x xs
               | otherwise = findMax cmx xs

main = do 
    print $ mx [1..30]

Timing the above code, first for mx' (tail-recursive) and next for mx (non-tail-recursive), we have the following timings.

Lenovo-IdeaPad-Y510P:/tmp$ time ./t 
30

real    0m0.002s
user    0m0.000s
sys 0m0.001s
Lenovo-IdeaPad-Y510P:/tmp$ ghc -O2 t.hs
[1 of 1] Compiling Main             ( t.hs, t.o )
Linking t ...
Lenovo-IdeaPad-Y510P:/tmp$ time ./t 
30

real    0m6.272s
user    0m6.274s
sys 0m0.000s

Can someone please explain why there is such a massive difference in performance for a list of just 30 elements?

like image 843
bomp Avatar asked Nov 14 '14 10:11

bomp


2 Answers

As others pointed out, GHC does not do common subexpression elimination (CSE), causing your first snippet to run in exponential time.

To see why, consider e.g.

test1 = length [1..1000] + sum [1..1000]
test2 = let l = [1..1000] in length l + sum l

The two examples are semantically equivalent, but test1 runs in constant space while test2 in linear space (the whole 1000 cells get allocated). Basically, in this case CSE negates the advantages of laziness.

Since CSE can lead to worse performance, GHC is quite conservative in applying it.

More explanation in GHC FAQs:

https://www.haskell.org/haskellwiki/GHC/FAQ#Does_GHC_do_common_subexpression_elimination.3F

like image 167
chi Avatar answered Nov 15 '22 12:11

chi


The problem is not so much the tail-recursivity, it's the fact that in mx in the general case you compute mx xs twice: once to compare it to x, and then a second time to return it. Each one of these calls itself calls mx xs twice, which then does the same, etc... resulting in an exponential run time.

You can remove this problem by simply saving the result of the first call to use it the second time:

mx :: (Ord a) => [a] -> a
mx [] = error "Empty list"
mx [x] = x
mx (x:xs) =
    let mxxs = mx xs in
    if x > mxxs then x else mxxs
like image 21
Tarmil Avatar answered Nov 15 '22 11:11

Tarmil