Currently I'm catching up on Haskell, and I'm very impressed so far. As a simple test I wrote a program that computes the sum of the integers up to one billion. To avoid list creation, I wrote a function that should be tail recursive:
summation start upto
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)

main = print $ summation 0 1000000000
Running this compiled with -O2, I get a runtime of about 20 seconds on my machine, which surprised me; I expected the compiler to optimise this better. For comparison, I wrote a simple C++ program:
#include <iostream>

int main(int argc, char *argv[]) {
    long long result = 0;
    int upto = 1000000000;
    for (int i = 0; i < upto; i++) {
        result += i;
    }
    std::cout << result << std::endl;
    return 0;
}
Compiled with clang++ without optimisation, the runtime is about 3 seconds. So I'm wondering why my Haskell solution is so slow. Does anybody have an idea?
On OSX:
clang++ --version:
Apple LLVM version 7.0.2 (clang-700.1.81)
Target: x86_64-apple-darwin15.2.0
Thread model: posix
ghc --version:
The Glorious Glasgow Haskell Compilation System, version 7.10.3
Adding a type signature dropped my runtime from 14.35 seconds to 0.27; it is now faster than the C++ on my machine. Don't rely on type defaulting when performance matters: without a signature, the numbers default to Integer. Ints aren't preferable for, say, modeling a domain in a web application, but they're great if you want a tight loop.
module Main where
summation :: Int -> Int -> Int
summation start upto
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)
main = print $ summation 0 1000000000
[1 of 1] Compiling Main ( code/summation.hs, code/summation.o )
Linking bin/build ...
500000000500000000
14.35user 0.06system 0:14.41elapsed 100%CPU (0avgtext+0avgdata 3992maxresident)k
0inputs+0outputs (0major+300minor)pagefaults 0swaps
Linking bin/build ...
500000000500000000
0.27user 0.00system 0:00.28elapsed 98%CPU (0avgtext+0avgdata 3428maxresident)k
0inputs+0outputs (0major+171minor)pagefaults 0swaps
Skip the strike-out unless you want to see the unoptimized (non -O2) view.
Let's look at the evaluation:

summation start upto
  | upto == 0 = start
  | otherwise = summation (start+upto) (upto-1)

main = print $ summation 0 1000000000
-->
summation 0 1000000000
-->
summation (0 + 1000000000) 999999999
-->
summation (0 + 1000000000 + 999999999) 999999998
-->
summation (0 + 1000000000 + 999999999 + 999999998) 999999997
EDIT: I didn't see that you had compiled with -O2, so the above isn't occurring. With proper optimization levels, the accumulator usually suffices even without any strictness annotations.
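For reference, a minimal sketch of the explicitly strict version (the bang on the accumulator forces it at each call, so no thunk chain can build up even without -O2):

```haskell
{-# LANGUAGE BangPatterns #-}
module Main where

-- The bang forces the accumulator at each recursive call, so the
-- sum is computed incrementally instead of building a
-- billion-element chain of (+) thunks.
summation :: Int -> Int -> Int
summation !acc 0    = acc
summation !acc upto = summation (acc + upto) (upto - 1)

main :: IO ()
main = print (summation 0 1000000000)
```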
Oh no! You are storing one billion numbers in one big thunk that you aren't evaluating! Tsk! There are lots of solutions using accumulators and strictness - most Stack Overflow answers anywhere near this question will teach you those, along with library functions like fold{l,r} that help you avoid writing your own primitive recursive functions. Since you can look around and/or ask about those concepts, I'll cut to the chase with this answer.
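As a sketch of the fold-based route just mentioned: Data.List's foldl' is the strict left fold, which forces the accumulator at every step (the helper name sumTo is mine, not from the question):

```haskell
module Main where

import Data.List (foldl')

-- foldl' forces its accumulator at each step, so no thunk chain
-- builds up; with -O2, list fusion avoids allocating the list.
sumTo :: Int -> Int
sumTo n = foldl' (+) 0 [0 .. n]

main :: IO ()
main = print (sumTo 1000000000)
```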
If you really want to do this the right way, you'd use a list and learn that Haskell compilers can do "deforestation", which means the billion-element list is never actually allocated:
main = print (sum [0..1000000000])
Then:
% ghc -O2 x.hs
[1 of 1] Compiling Main ( x.hs, x.o )
Linking x ...
% time ./x
500000000500000000
./x 16.09s user 0.13s system 99% cpu 16.267 total
Cool, but why 16 seconds? By default those values are Integers (GMP arbitrary-precision integers in GHC), which are slower than a machine Int. Let's use Int!
% cat x.hs
main = print (sum [0..1000000000] :: Int)
tommd@HalfAndHalf /tmp% ghc -O2 x.hs && time ./x
500000000500000000
./x 0.31s user 0.00s system 99% cpu 0.311 total