Concurrency in Python, multiprocess slower than single process

Question

I'm writing a simple script that simulates a maths problem: 'The Frog Problem', presented by Matt Parker of standupmaths on his YouTube channel. In short, a frog is trying to hop from one side of a river to the other across lily pads, jumping a random number of pads forward each time. My code simulates this by subtracting a random number from the number of lily pads left and repeating until that number is 0.

This is the entire thing:

import random
import datetime
from multiprocessing import Pool

def frog_time(num_lillypads):
    jumps = 0
    while num_lillypads > 0:
        num_lillypads -= random.randint(1, num_lillypads)
        jumps += 1
    return jumps

def frog_run(num_lillypads, iterations=10000):
    ave = 0
    print("Running {} lillypads.".format(num_lillypads))
    for i in range(1, iterations+1):
        ave = (ave*(i-1)+frog_time(num_lillypads))/i
    return ave

def single_run(max_lillypads, iterations):
    start = datetime.datetime.now()
    results = []
    for i in range(1, max_lillypads+1):
        results.append(frog_run(i, iterations))
    time_taken = datetime.datetime.now() - start
    return time_taken

def timing_run(max_lillypads, iterations):
    start = datetime.datetime.now()
    with Pool() as pool:
        # Pass iterations explicitly so both runs do the same amount of work.
        args = [(n, iterations) for n in range(1, max_lillypads+1)]
        results = pool.starmap(frog_run, args)
    time_taken = datetime.datetime.now() - start
    return time_taken

def test(max_lillypads=1000, iters=10000):
    # Renamed the first parameter so it no longer shadows the built-in max().
    print("Concurrent run")
    concurrent_time = timing_run(max_lillypads, iters)
    print("Single run")
    single_time = single_run(max_lillypads, iters)
    print("Single run took {} to finish.".format(single_time))
    print("Concurrent run took {} to finish.".format(concurrent_time))

I decided to use this as an exercise to practice concurrent programming in Python, but I expected wildly different results. When I run this I get:

Single run took 0:01:55.825933 to finish.
Concurrent run took 0:02:00.110245 to finish.

I thought that the run that implemented multiprocessing would be at least a little bit faster, if not significantly faster, but in this case it actually takes longer!

Can anybody who knows more about Python multiprocessing help me out by explaining this result? Is the overhead of creating a new process for each of these too high to make a difference, is Python's random module too slow, or is something else wrong here?

MyNameIsCaleb · Accepted Answer

Right now you aren't specifying the number of worker processes, so Pool falls back to its default of one per CPU. From the multiprocessing documentation:

processes is the number of worker processes to use. If processes is None then the number returned by os.cpu_count() is used.
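As a minimal sketch of what that default means in practice, the snippet below prints the CPU count and then creates a pool capped at two workers instead. The `square` task and the `default_worker_count` helper are hypothetical names used only for illustration, and newer Python versions may base the default on `os.process_cpu_count()` rather than `os.cpu_count()`.

```python
import os
from multiprocessing import Pool


def square(x):
    # Trivial stand-in task (hypothetical), only here to give the pool work.
    return x * x


def default_worker_count():
    # Per the documentation quoted above, Pool() with no argument sizes itself
    # from the CPU count. os.cpu_count() can return None when the count cannot
    # be determined, so this sketch falls back to 1 in that case.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("Default pool size would be about:", default_worker_count())
    # Explicitly cap the pool at two workers instead of one per CPU.
    with Pool(processes=2) as pool:
        print(pool.map(square, range(5)))
```

The `if __name__ == "__main__":` guard matters on platforms that start workers by re-importing the main module.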

Each worker process takes some amount of time to set up.

So, let's use some arbitrary values to see how this plays out:
- the function takes 120 seconds to run in one process
- each process takes 5 seconds to start
- the workload divides equally among the processes

If that were the case:

No multiprocessing: 120 seconds
Multiprocessing with 2 processes: 60 seconds + 10 seconds = 70 seconds
Multiprocessing with 3 processes: 40 seconds + 15 seconds = 55 seconds
Multiprocessing with 4 processes: 30 seconds + 20 seconds = 50 seconds
Multiprocessing with 5 processes: 24 seconds + 25 seconds = 49 seconds
Multiprocessing with 6 processes: 20 seconds + 30 seconds = 50 seconds
Multiprocessing with 7 processes: 17 seconds + 35 seconds = 52 seconds
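The arithmetic above can be reproduced with a small model. This is only a sketch of the toy assumptions in this answer (work splits evenly, each process adds a fixed startup cost, and startup costs add up one after another); the function names are hypothetical and the numbers are the arbitrary ones used above, not measurements.

```python
def estimated_total(work_seconds, startup_seconds, n_procs):
    # With a single process there is no pool, so no startup cost is charged.
    if n_procs == 1:
        return work_seconds
    # Toy model: work is split evenly, and every process adds startup time.
    return work_seconds / n_procs + startup_seconds * n_procs


def best_process_count(work_seconds, startup_seconds, max_procs):
    # Process count with the lowest estimated total under the toy model.
    return min(
        range(1, max_procs + 1),
        key=lambda n: estimated_total(work_seconds, startup_seconds, n),
    )


if __name__ == "__main__":
    for n in range(1, 8):
        total = estimated_total(120, 5, n)
        print("{} process(es): {:.0f} seconds".format(n, total))
    print("Best under this model:", best_process_count(120, 5, 7))
```

Real startup costs are usually far smaller than 5 seconds, so the crossover point on an actual machine will differ.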

So there is a point beyond which adding processes no longer helps, because the time lost creating processes outweighs the time saved by splitting the work. You can cap the number of processes at a value where you are still saving more time than you lose.

If you use Pool(2) or Pool(3), etc., you will probably see the run time drop and then rise again as you add processes. At a much larger scale, more processes would serve you better, but at a small testing scale that may not be the case.
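One way to check this on your own machine is to time the same workload with several pool sizes. The sketch below reuses `frog_time` from the question; `time_pool` is a hypothetical helper, `frog_run` is simplified to a plain mean, and the workload size and pool sizes are arbitrary choices kept small so the comparison finishes quickly.

```python
import random
import time
from multiprocessing import Pool


def frog_time(num_lillypads):
    # Same simulation as in the question: count jumps until no pads remain.
    jumps = 0
    while num_lillypads > 0:
        num_lillypads -= random.randint(1, num_lillypads)
        jumps += 1
    return jumps


def frog_run(num_lillypads, iterations):
    # Mean number of jumps over `iterations` simulated crossings.
    return sum(frog_time(num_lillypads) for _ in range(iterations)) / iterations


def time_pool(n_procs, max_lillypads, iterations):
    # Hypothetical helper: wall-clock seconds for one run with n_procs workers,
    # including the time spent creating and shutting down the pool.
    args = [(n, iterations) for n in range(1, max_lillypads + 1)]
    start = time.perf_counter()
    with Pool(processes=n_procs) as pool:
        pool.starmap(frog_run, args)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Small, arbitrary workload; increase it to approach the question's scale.
    for n_procs in (1, 2, 4):
        elapsed = time_pool(n_procs, max_lillypads=200, iterations=200)
        print("Pool({}) took {:.2f} s".format(n_procs, elapsed))
```

Timings from a workload this small are noisy, so treat them as a rough guide and repeat the runs before drawing conclusions.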

Tags: python, concurrency, multiprocessing

mattrea6
