Project Euler #14 Tips in Haskell? [closed]

Question

I am trying euler challenge 14. I was wondering if I could have any tips for calculating it quickly in haskell. I tried this naive approach.

import Data.List import Data.Function collatz n | even n = n quot 2 | otherwise = 3*n+1 colSeq = takeWhile (/= 1) . (iterate collatz) main=print $ maximumBy (compare on (length . colSeq)) [1..999999]

But that took too long.

λ <*Main System.Timeout>: timeout (10^6*60) main
Nothing

I also tried using the reverse collatz relation, and keeping the lengths in a map to eliminate redundant calculations, but that didn't work either. And don't want the solution, but does anyone have some mathematical literature, or programming technique that will make this quicker, or do I just have to leave it over night?

Zeta · Accepted Answer

Your program is not as slow as you might think…

First of all, your program runs fine and finishes in under two minutes if you compile with -O2 and increase the stack size (I used +RTS -K100m, but your system might vary):

$ .\collatz.exe +RTS -K100m -s
  65,565,993,768 bytes allocated in the heap
  16,662,910,752 bytes copied during GC
      77,042,796 bytes maximum residency (1129 sample(s))
       5,199,140 bytes maximum slop
             184 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     124724 colls,     0 par   18.41s   18.19s     0.0001s    0.0032s
  Gen  1      1129 colls,     0 par   16.67s   16.34s     0.0145s    0.1158s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   39.98s  ( 41.17s elapsed)
  GC      time   35.08s  ( 34.52s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   75.06s  ( 75.69s elapsed)

  %GC     time      46.7%  (45.6% elapsed)

  Alloc rate    1,639,790,387 bytes per MUT second

  Productivity  53.3% of total user, 52.8% of total elapsed

…but that's still slow

Productivity of ~50% percent means that the GC is using half the time we're staring at the screen, waiting for our result. In our case we create to much garbage by iterating the sequence for every value.

Improvements

The Collatz sequence is a recursive sequence. Therefore, we should define it as a recursive sequence instead of a iterative one and have a look at what happens.

colSeq 1 = [1]
colSeq n 
  | even n    = n : colSeq (n `div` 2) 
  | otherwise = n : colSeq (3 * n + 1)

The list in Haskell is a fundamental type, so GHC should have some nifty optimization (-O2). So lets try this one:

Result

$ .\collatz_rec.exe +RTS -s
  37,491,417,368 bytes allocated in the heap
       4,288,084 bytes copied during GC
          41,860 bytes maximum residency (2 sample(s))
          19,580 bytes maximum slop
               1 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     72068 colls,     0 par    0.22s    0.22s     0.0000s    0.0001s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   32.89s  ( 33.12s elapsed)
  GC      time    0.22s  (  0.22s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   33.11s  ( 33.33s elapsed)

  %GC     time       0.7%  (0.7% elapsed)

  Alloc rate    1,139,881,573 bytes per MUT second

  Productivity  99.3% of total user, 98.7% of total elapsed

Note that we're now up to 99% productivity in ~80% MUT time (compared to the original version). Just by this small change we decreased the runtime tremendously.

Wait, there's more!

There's a thing that's rather strange. Why are we calculating the length of both 1024 and 512? After all, the later cannot create a longer Collatz sequence.

Improvements

However, in this case we must see the problem as one big task, and not a map. We need to keep track of the values we already calculated, and we want to clear those values we already visited.

We use Data.Set for this:

problem_14 :: S.Set Integer -> [(Integer, Integer)]
problem_14 s
  | S.null s  = []
  | otherwise = (c, fromIntegral $ length csq) : problem_14 rest
  where (c, rest') = S.deleteFindMin s
        csq        = colSeq c
        rest       = rest' `S.difference` S.fromList csq

And we use problem_14 like that:

main = print $ maximumBy (compare `on` snd) $ problem_14 $ S.fromList [1..999999]

Result

$ .\collatz_set.exe +RTS -s
  18,405,282,060 bytes allocated in the heap
   1,645,842,328 bytes copied during GC
      27,446,972 bytes maximum residency (40 sample(s))
         373,056 bytes maximum slop
              79 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0     35193 colls,     0 par    2.17s    2.03s     0.0001s    0.0002s
  Gen  1        40 colls,     0 par    0.84s    0.77s     0.0194s    0.0468s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time   14.91s  ( 15.17s elapsed)
  GC      time    3.02s  (  2.81s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time   17.92s  ( 17.98s elapsed)

  %GC     time      16.8%  (15.6% elapsed)

  Alloc rate    1,234,735,903 bytes per MUT second

  Productivity  83.2% of total user, 82.9% of total elapsed

We loose some productivity, but that's reasonable. After all, we're now using Set and not the list anymore and use 79MB instead of 1MB. However, our program now runs in 17s instead of 34s, that's only 25% of the original time.

Using `ST`

Inspiration (C++)

int main(){
  std::vector<bool> Q(1000000,true);

  unsigned long long max_l = 0, max_c = 1;

  for(unsigned long i = 1; i < Q.size(); ++i){
    if(!Q[i])
      continue;
    unsigned long long c = i, l = 0;
    while(c != 1){
      if(c < Q.size()) Q[c] = false;
      c = c % 2 == 0 ? c / 2 : 3 * c + 1;
      l++;
    }
    if(l > max_l){
      max_l = l;
      max_c = i;
    }
  }
  std::cout << max_c << std::endl;
}

This program runs in 130ms. Our yet best version needs 100 times more. We can fix that.

Haskell

problem_14_vector_st :: Int -> (Int, Int)
problem_14_vector_st limit = 
  runST $ do
    q <- V.replicate (limit+1) True    
    best <- newSTRef (1,1)
    forM_ [1..limit] $ \i -> do
      b <- V.read q i
      when b $ do
        let csq = colSeq $ fromIntegral i
        let l   = fromIntegral $ length csq
        forM_ (map fromIntegral csq) $ \j-> 
          when (j<= limit && j>= 0) $  V.write q j False
        m <- fmap snd $ readSTRef best
        when (l > m) $ writeSTRef best (i,l)
    readSTRef best

Result

$ collatz_vector_st.exe +RTS -s
   2,762,282,216 bytes allocated in the heap
      10,021,016 bytes copied during GC
       1,026,580 bytes maximum residency (2 sample(s))
          21,684 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5286 colls,     0 par    0.02s    0.02s     0.0000s    0.0000s
  Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

  INIT    time    0.00s  (  0.00s elapsed)
  MUT     time    3.09s  (  3.08s elapsed)
  GC      time    0.02s  (  0.02s elapsed)
  EXIT    time    0.00s  (  0.00s elapsed)
  Total   time    3.11s  (  3.11s elapsed)

  %GC     time       0.5%  (0.7% elapsed)

  Alloc rate    892,858,898 bytes per MUT second

  Productivity  99.5% of total user, 99.6% of total elapsed

~3 seconds. Someone else might know more tricks, but that's the most I could squeeze out of Haskell.

Damien Black · Answer

Caching the value of integers you've already hit will save you a lot of time. If you toss in the number 1234, and find that takes 273 steps to get to 1, associate the values. 1234->273.

Now if you ever hit 1234 while in a sequence, you don't have to take 273 more steps to find the answer, just add 273 to your current number and you know the length of the sequence.

Do this for every number you calculate, even the ones in the middle of a sequence. For example, if you are at 1234 and you don't have a value yet, do the step (divide by 2) and calculate and cache the value for 617. You cache almost all the important values really quick this way. There are some really long chains that you'll end up on again and again.

The easiest way to cache all the values as you go is to make a recursive function. Like this (in pseudo-code):

function collatz(number) {
    if number is 1: return 1
    else if number is in cache: return cached value
    else perform step: newnumber = div 2 if even, time 3 + 1 if odd
    steps = collatz(newnumber) + 1 //+1 for the step we just took
    cache steps as the result for number
    return steps
}

Hopefully Haskell won't have problems with the depths of recursion that you'll end up in like this. However, it haskell doesn't like it, you can implement the same thing with a stack, it is just less intuitive.

Project Euler #14 Tips in Haskell? [closed]

Tags:

optimization

math

haskell

PyRulez

2 Answers

Your program is not as slow as you might think…

…but that's still slow

Improvements

Result

Wait, there's more!

Improvements

Result

Using `ST`

Inspiration (C++)

Haskell

Result

Zeta

Damien Black

Recent Activity

Donate For Us

Project Euler #14 Tips in Haskell? [closed]

Tags:

optimization

math

haskell

PyRulez

2 Answers

Your program is not as slow as you might think…

…but that's still slow

Improvements

Result

Wait, there's more!

Improvements

Result

Using ST

Inspiration (C++)

Haskell

Result

Zeta

Damien Black

Related questions

Recent Activity

Donate For Us

Using `ST`