Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Speed-up a single task using multi-processing or threading

Is it possible to speed up a single task using multi-processing/threading? My gut feeling is that the answer is 'no'. Here is an example of what I mean by a "single task":

for i in range(max):
    pick = random.choice(['on', 'off', 'both'])

With an argument of 10000000 it takes about 7.9 seconds to complete on my system.

I have a basic grasp of how to use multi-processing and threading for multiple tasks. For example, if I have 10 directories each one containing X number of files that need to be read, I could use create 10 threads.

I suspect that the single task is using only a single process (task manager reports CPU usage is minimal). Is there a way to leverage my other cores in such cases? Or is increasing the CPU/Memory speeds the only way to get faster results?

like image 935
Dirty Penguin Avatar asked Jun 23 '13 15:06

Dirty Penguin


1 Answers

Here is a benchmark of your code with and without multiprocessing:

#!/usr/bin/env python

import random
import time

def test1():
    print "for loop with no multiproc: "
    m = 10000000
    t = time.time()
    for i in range(m):
        pick = random.choice(['on', 'off', 'both'])
    print time.time()-t

def test2():
    print "map with no multiproc: "
    m = 10000000
    t = time.time()
    map(lambda x: random.choice(['on', 'off', 'both']), range(m))
    print time.time()-t

def rdc(x):
    return random.choice(['on', 'off', 'both'])

def test3():
    from multiprocessing import Pool

    pool = Pool(processes=4)
    m = 10000000

    print "map with multiproc: "
    t = time.time()

    r = pool.map(rdc, range(m))
    print time.time()-t

if __name__ == "__main__":
    test1()
    test2()
    test3()

And here is the result on my workstation (which is a quadcore):

for loop with no multiproc: 
8.31032013893
map with no multiproc: 
9.48167610168
map with multiproc: 
4.94983720779

Is it possible to speed up a single task using multi-processing/threading? My gut feeling is that the answer is 'no'.

well, afaict, the answer is "damn, yes".

Is there a way to leverage my other cores in such cases? Or is increasing the CPU/Memory speeds the only way to get faster results?

yes, by using multiprocessing. Python can't handle multiple cores by using threading, because of the GIL, but it can rely on your operating system's scheduler to leverage the other cores. Then you can get a real improvement on your tasks.

like image 88
zmo Avatar answered Oct 05 '22 12:10

zmo