Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why is sum slower than foldl' in haskell?

How come the default sum in ghc is ~10 times slower than its foldl' (the stricter equivalent of foldl) equivalent? And if that's the case, why isn't it implemented using foldl'?

import Data.List
> foldl' (+) 0 [1..10^7]
50000005000000
(0.39 secs, 963,528,816 bytes)

> sum [1..10^7]
50000005000000
(4.13 secs, 1,695,569,176 bytes)

For completeness here are also the stats of foldl and foldr.

> foldl (+) 0 [1..10^7]
50000005000000
(4.02 secs, 1,695,828,752 bytes)

> foldr (+) 0 [1..10^7]
50000005000000
(3.78 secs, 1,698,386,648 bytes)

Looks like sum is implemented using foldl since their runtime is similar. Tested on ghc 7.10.2.

like image 621
dimid Avatar asked Apr 28 '16 10:04

dimid


1 Answers

The sum function is implemented using foldl in GHC:

-- | The 'sum' function computes the sum of a finite list of numbers.
sum                     :: (Num a) => [a] -> a
{-# INLINE sum #-}
sum                     =  foldl (+) 0

as can be seen in the source.

It has to be this way, because it is the specification in the Haskell report.

The rationale was likely that for certain lazy element types of the list, foldl is the right thing to do. (I personally think that foldl is almost always wrong and only foldl' should be used.)

With sufficient optimization, GHC will inline that definition, specialize it to the element type at hand, notice that the loop is strict, and force the evaluation of the accumulator in every iteration; effectively turning this into a foldl', as observed by @AndrásKovács.

Since GHC-7.10, sum itself is a method of the Foldable type class, and the default definition goes via foldMap. The instance Foldable [] however overrides this with the above definition of sum.

like image 161
Joachim Breitner Avatar answered Sep 28 '22 19:09

Joachim Breitner