Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sum all numbers from one to a billion in Haskell

Currently i'm catching up on Haskell, and I'm super impressed so far. As a super simple test I wrote a program which computes the sum up till a billion. In order to avoid list creation, I wrote a function which should be tail recursive

summation start upto 
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)

main = print $ summation 0 1000000000

running this with -O2 I get a runtime of about ~20sec on my machine, which kind of surprised me, since I thought the compiler would be more optimising. As a comparison I wrote a simple c++ program

#include <iostream>

int main(int argc, char *argv[]) {
  long long result = 0;
  int upto = 1000000000;

  for (int i = 0; i < upto; i++) {
    result += i;
  }
  std::cout << result << std::end;
  return 0;
}

compiling with clang++ without optimisation the runtime is ~3secs. So I was wondering why my Haskell solution is so slow. Has anybody an idea?

On OSX:

clang++ --version:

Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.2.0
Thread model: posix

ghc --version:

The Glorious Glasgow Haskell Compilation System, version 7.10.3
like image 682
Moe Avatar asked Jan 26 '16 17:01

Moe


2 Answers

Adding a type signature dropped my runtime from 14.35 seconds to 0.27. It is now faster than the C++ on my machine. Don't rely on type-defaulting when performance matters. Ints aren't preferable for, say, modeling a domain in a web application, but they're great if you want a tight loop.

module Main where

summation :: Int -> Int -> Int
summation start upto 
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)

main = print $ summation 0 1000000000


[1 of 1] Compiling Main             ( code/summation.hs, code/summation.o )
Linking bin/build ...
500000000500000000
14.35user 0.06system 0:14.41elapsed 100%CPU (0avgtext+0avgdata 3992maxresident)k
0inputs+0outputs (0major+300minor)pagefaults 0swaps

Linking bin/build ...
500000000500000000
0.27user 0.00system 0:00.28elapsed 98%CPU (0avgtext+0avgdata 3428maxresident)k
0inputs+0outputs (0major+171minor)pagefaults 0swaps
like image 188
bitemyapp Avatar answered Oct 12 '22 20:10

bitemyapp


Skip the strike-out unless you want to see the unoptimized (non -O2) view.

Lets look at the evaluation:

summation start upto 
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)

main = print $ summation 0 1000000000

-->

summation 0 1000000000

-->

summations (0 + 1000000000) 999999999

-->

summation (0 + 1000000000 + 999999999) 999999998

-->

summation (0 + 1000000000 + 999999999 + 999999998) 999999997

EDIT: I didn't see that you had compiled with -O2 so the above isn't occuring. The accumulator, even without any strictness annotations, suffices most of the time with proper optimization levels.

Oh no! You are storing one billion numbers in a big thunk that you aren't evaluating! Tisk! There are lots of solutions using accumulators and strictness - it seems like most stackoverflow answers with anything near this question will suffice to teach you those in addition to library functions, like fold{l,r}, that help you avoid writing your own primitive recursive functions. Since you can look around and/or ask about those concepts I'll cut to the chase with this answer.

If you really want to do this the correct way then you'd use a list and learn that Haskell compilers can do "deforestation" which means the billion-element list is never actually allocated:

main = print (sum [0..1000000000])

Then:

% ghc -O2 x.hs
[1 of 1] Compiling Main             ( x.hs, x.o )
Linking x ...
% time ./x
500000000500000000
./x  16.09s user 0.13s system 99% cpu 16.267 total

Cool, but why 16 seconds? Well by default those values are Integers (GMP integers for the GHC compiler) and that's slower than a machine Int. Lets use Int!

% cat x.hs
main = print (sum [0..1000000000] :: Int)
tommd@HalfAndHalf /tmp% ghc -O2 x.hs && time ./x
500000000500000000
./x  0.31s user 0.00s system 99% cpu 0.311 total
like image 28
Thomas M. DuBuisson Avatar answered Oct 12 '22 18:10

Thomas M. DuBuisson