I have a problem, which is similar to this:
import numpy as np

C = np.zeros((100, 10))
for i in range(10):
    C_sub = get_sub_matrix_C(i, other_args)  # shape 10x10
    C[i*10:(i+1)*10, :10] = C_sub
So, clearly there is no need to run this as a serial calculation, since each submatrix can be computed independently. I would like to use the multiprocessing module and create up to 4 processes for the for loop. I read some tutorials about multiprocessing, but wasn't able to figure out how to apply it to my problem.
Thanks for your help
A simple way to parallelize that code is to use a Pool of processes:
import multiprocessing

pool = multiprocessing.Pool()
results = pool.starmap(get_sub_matrix_C, ((i, other_args) for i in range(10)))
for i, res in enumerate(results):
    C[i*10:(i+1)*10, :10] = res
I've used starmap since get_sub_matrix_C takes more than one argument (starmap(f, [(x1, ..., xN)]) calls f(x1, ..., xN)).
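As a quick illustration of that unpacking, here is a tiny self-contained demo using the built-in pow instead of get_sub_matrix_C:

import multiprocessing

if __name__ == '__main__':
    with multiprocessing.Pool() as pool:
        # Each tuple is unpacked into positional arguments: pow(2, 5), pow(3, 2), pow(10, 3)
        print(pool.starmap(pow, [(2, 5), (3, 2), (10, 3)]))  # prints [32, 9, 1000]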
Note however that serialization/deserialization may take significant time and space, so you may have to use a more low-level solution to avoid that overhead.
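One such lower-level approach is to allocate the output matrix in shared memory, so each worker writes its 10x10 block in place instead of returning it (and pickling it) to the parent. The following is only a rough sketch, with a dummy get_sub_matrix_C and other_args standing in for the real ones:

import multiprocessing
import numpy as np

def get_sub_matrix_C(i, other_args):
    # Placeholder for the real computation.
    return np.full((10, 10), i, dtype=np.float64)

def init_worker(shared_arr):
    # Stash the shared buffer in a module-level global so worker tasks can reach it.
    global _shared
    _shared = shared_arr

def worker(i, other_args):
    # View the shared buffer as the full 100x10 matrix and fill one block of rows in place.
    C = np.frombuffer(_shared, dtype=np.float64).reshape(100, 10)
    C[i*10:(i+1)*10, :] = get_sub_matrix_C(i, other_args)

if __name__ == '__main__':
    other_args = None
    # RawArray is an unsynchronized shared buffer; that is fine here because
    # each worker writes a disjoint block of rows.
    shared = multiprocessing.RawArray('d', 100 * 10)
    with multiprocessing.Pool(4, initializer=init_worker, initargs=(shared,)) as pool:
        pool.starmap(worker, ((i, other_args) for i in range(10)))
    C = np.frombuffer(shared, dtype=np.float64).reshape(100, 10)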
It looks like you are running an outdated version of Python. You can replace starmap with plain map, but then you have to provide a function that takes a single parameter:
import multiprocessing

def f(args):
    return get_sub_matrix_C(*args)

pool = multiprocessing.Pool()
results = pool.map(f, ((i, other_args) for i in range(10)))
for i, res in enumerate(results):
    C[i*10:(i+1)*10, :10] = res
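Since you asked for at most 4 processes, here is how that map variant might look end to end; get_sub_matrix_C and other_args below are placeholders for your real ones:

import multiprocessing
import numpy as np

def get_sub_matrix_C(i, other_args):
    # Placeholder for the real computation.
    return np.full((10, 10), i)

def f(args):
    return get_sub_matrix_C(*args)

if __name__ == '__main__':
    other_args = None
    C = np.zeros((100, 10))
    # Cap the pool at 4 worker processes, as requested.
    pool = multiprocessing.Pool(processes=4)
    try:
        results = pool.map(f, [(i, other_args) for i in range(10)])
    finally:
        pool.close()
        pool.join()
    for i, res in enumerate(results):
        C[i*10:(i+1)*10, :] = res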