
Parallel Fibonacci example from "Parallel and Concurrent Programming" [duplicate]

I tried running the first example here: http://chimera.labs.oreilly.com/books/1230000000929/ch03.html

Code: https://github.com/simonmar/parconc-examples/blob/master/strat.hs

import Control.Parallel
import Control.Parallel.Strategies (rpar, Strategy, using)
import Text.Printf
import System.Environment

-- <<fib
fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)
-- >>

main = print pair
 where
  pair =
-- <<pair
   (fib 35, fib 36) `using` parPair
-- >>

-- <<parPair
parPair :: Strategy (a,b)
parPair (a,b) = do
  a' <- rpar a
  b' <- rpar b
  return (a',b')
-- >>

I built it with GHC 7.10.2 (on OS X, on a multicore machine) using the following command:

ghc -O2 strat.hs -threaded -rtsopts -eventlog

And run using:

./strat +RTS -N2 -l -s

I expected the two fib calculations to run in parallel (the examples from the previous chapter worked as expected, so this is not a setup issue), but I got no speedup at all, as seen here:

  % ./strat +RTS -N2 -l -s
(14930352,24157817)
   3,127,178,800 bytes allocated in the heap
       6,323,360 bytes copied during GC
          70,000 bytes maximum residency (2 sample(s))
          31,576 bytes maximum slop
               2 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0      5963 colls,  5963 par    0.179s   0.074s     0.0000s    0.0001s
  Gen  1         2 colls,     1 par    0.000s   0.000s     0.0001s    0.0001s

  Parallel GC work balance: 2.34% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)

  SPARKS: 2 (0 converted, 0 overflowed, 0 dud, 1 GC'd, 1 fizzled)

  INIT    time    0.000s  (  0.001s elapsed)
  MUT     time    1.809s  (  1.870s elapsed)
  GC      time    0.180s  (  0.074s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.991s  (  1.945s elapsed)

  Alloc rate    1,728,514,772 bytes per MUT second

  Productivity  91.0% of total user, 93.1% of total elapsed

gc_alloc_block_sync: 238
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

Running with -N1 gives similar results (omitted).

The number of garbage collections seemed suspicious, as others pointed out in #haskell-beginners, so I tried adding -A16M when running. The results were much more in line with expectations:

  % ./strat +RTS -N2 -l -s -A16M
(14930352,24157817)
   3,127,179,920 bytes allocated in the heap
         260,960 bytes copied during GC
          69,984 bytes maximum residency (2 sample(s))
          28,320 bytes maximum slop
              33 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0       115 colls,   115 par    0.105s   0.002s     0.0000s    0.0003s
  Gen  1         2 colls,     1 par    0.000s   0.000s     0.0002s    0.0002s

  Parallel GC work balance: 71.25% (serial 0%, perfect 100%)

  TASKS: 6 (1 bound, 5 peak workers (5 total), using -N2)

  SPARKS: 2 (1 converted, 0 overflowed, 0 dud, 0 GC'd, 1 fizzled)

  INIT    time    0.001s  (  0.001s elapsed)
  MUT     time    1.579s  (  1.087s elapsed)
  GC      time    0.106s  (  0.002s elapsed)
  EXIT    time    0.000s  (  0.000s elapsed)
  Total   time    1.686s  (  1.091s elapsed)

  Alloc rate    1,980,993,138 bytes per MUT second

  Productivity  93.7% of total user, 144.8% of total elapsed

gc_alloc_block_sync: 27
whitehole_spin: 0
gen[0].sync: 0
gen[1].sync: 0

The question is: why does this happen? Even with frequent GC, I would intuitively expect the two sparks to run in parallel during the other 90% of the running time.

Simon, asked Mar 28 '16 20:03



1 Answer

Yes, this is actually a bug in GHC 8.0.1 and earlier (I'm working on fixing it for 8.0.2). The problem is that the expressions fib 35 and fib 36 are constant, so GHC lifts them to the top level as CAFs (constant applicative forms). The RTS then wrongly assumed that those CAFs were unreachable, and so it garbage collected the sparks.
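To make the lifting concrete, here is a rough sketch of the shape the program takes once the constant expressions are floated to the top level. This is an illustration only, not GHC's actual output: the names fib35 and fib36 are hypothetical (the compiler generates its own names for floated bindings), and the sketch assumes the same fib and parPair as in the question.

```haskell
import Control.Parallel.Strategies (Strategy, rpar, using)

fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

-- Spark both components of the pair, as in the question.
parPair :: Strategy (a,b)
parPair (a,b) = do
  a' <- rpar a
  b' <- rpar b
  return (a',b')

-- Hypothetical names for the floated constants. Each is a CAF:
-- a top-level value that takes no arguments and is evaluated at most once.
fib35 :: Integer
fib35 = fib 35

fib36 :: Integer
fib36 = fib 36

main :: IO ()
main = print ((fib35, fib36) `using` parPair)
```

In this form the sparks created by rpar point at the top-level CAFs rather than at thunks allocated inside main, which is the situation the bug description above refers to.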

You can work around it by making the expressions non-constant by passing in parameters on the command line:

main = do
     [a,b] <- map read <$> getArgs
     let pair = (fib a, fib b) `using` parPair
     print pair

and then run the program with ./strat 35 36.
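For reference, a complete version of strat.hs with this workaround might look like the sketch below. The argument validation with readMaybe and the usage message are my own additions, not part of the original example; fib and parPair are unchanged from the question.

```haskell
import Control.Parallel.Strategies (Strategy, rpar, using)
import System.Environment (getArgs)
import System.Exit (die)
import Text.Read (readMaybe)

fib :: Integer -> Integer
fib 0 = 1
fib 1 = 1
fib n = fib (n-1) + fib (n-2)

-- Spark both components of the pair, as in the question.
parPair :: Strategy (a,b)
parPair (a,b) = do
  a' <- rpar a
  b' <- rpar b
  return (a',b')

main :: IO ()
main = do
  args <- getArgs
  -- The inputs now come from the command line, so fib a and fib b are
  -- no longer constant expressions and cannot be floated out as CAFs.
  case mapM readMaybe args of
    Just [a, b]
      | a >= 0, b >= 0 -> print ((fib a, fib b) `using` parPair)
    -- Added assumption: reject anything other than two non-negative integers.
    _ -> die "usage: strat N M   (two non-negative integers)"
```

Built with the same command as in the question, it can be run as ./strat 35 36 +RTS -N2 -s; the SPARKS line in the -s output then shows whether a spark was converted.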

Simon Marlow, answered Oct 21 '22 02:10