Distributing jobs evenly across multiple GPUs with `multiprocessing.Pool`

Tags:

python

python-multiprocessing

Let's say that I have the following:

A system with 4 GPUs.
A function, foo, which may be run up to 2 times simultaneously on each GPU.
A list of files that need to be processed using foo in any order. However, each file takes an unpredictable amount of time to be processed.

I would like to process all the files, keeping all the GPUs as busy as possible by ensuring there are always 8 instances of foo running at any given time (2 instances on each GPU) until fewer than 8 files remain.

The actual details of invoking the GPU are not my issue. What I'm trying to figure out is how to write the parallelization so that 8 instances of foo are always running, with exactly 2 of them assigned to each GPU ID at all times.

I've come up with one way to solve this problem using multiprocessing.Pool, but the solution is quite brittle and relies on (AFAIK) undocumented features. It depends on the fact that the processes within the Pool are named in the format ForkPoolWorker-%d, where %d is a number between one and the number of processes in the pool. I subtract one from this value and take it modulo the number of GPUs, which gives me a valid GPU id. However, it would be much nicer if I could somehow give the GPU id directly to each process, perhaps on initialization, instead of relying on the string format of the process names.

One thing I considered: if the initializer and initargs parameters of Pool.__init__ accepted a list of initargs, so that each process could be initialized with a different set of arguments, the problem would be moot. Unfortunately, that doesn't appear to work.

Can anybody recommend a more robust or pythonic solution to this problem?

Hacky solution (Python 3.7):

from multiprocessing import Pool, current_process

def foo(filename):
    # Hacky way to get a GPU id using process name (format "ForkPoolWorker-%d")
    gpu_id = (int(current_process().name.split('-')[-1]) - 1) % 4

    # run processing on GPU <gpu_id>
    ident = current_process().ident
    print('{}: starting process on GPU {}'.format(ident, gpu_id))
    # ... process filename
    print('{}: finished'.format(ident))

pool = Pool(processes=4*2)

files = ['file{}.xyz'.format(x) for x in range(1000)]
for _ in pool.imap_unordered(foo, files):
    pass
pool.close()
pool.join()

475

asked Nov 22 '18 01:11

jodag

1 Answer

I figured it out. It's actually quite simple. All we need to do is use a multiprocessing.Queue to manage the available GPU IDs. Start by initializing the Queue to contain 2 of each GPU ID, then get the GPU ID from the queue at the beginning of foo and put it back at the end.

from multiprocessing import Pool, current_process, Queue

NUM_GPUS = 4
PROC_PER_GPU = 2

queue = Queue()

def foo(filename):
    gpu_id = queue.get()
    try:
        # run processing on GPU <gpu_id>
        ident = current_process().ident
        print('{}: starting process on GPU {}'.format(ident, gpu_id))
        # ... process filename
        print('{}: finished'.format(ident))
    finally:
        queue.put(gpu_id)

# initialize the queue with PROC_PER_GPU copies of each GPU id
for gpu_id in range(NUM_GPUS):
    for _ in range(PROC_PER_GPU):
        queue.put(gpu_id)

pool = Pool(processes=PROC_PER_GPU * NUM_GPUS)
files = ['file{}.xyz'.format(x) for x in range(1000)]
for _ in pool.imap_unordered(foo, files):
    pass
pool.close()
pool.join()
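
One caveat: the code above relies on the worker processes inheriting the module-level queue, which works with the fork start method (as in the ForkPoolWorker processes from the question). Under the spawn start method, each worker re-imports the module and would get its own empty queue. Below is a minimal sketch of the same idea that should not depend on the start method, assuming the queue is handed to each worker through the Pool's initializer and initargs. The names gpu_slots, init_worker, process_file and run_all are hypothetical helpers introduced for this illustration, and the GPU work itself is left as a placeholder.

```python
import multiprocessing as mp

NUM_GPUS = 4
PROC_PER_GPU = 2

# Set inside each worker process by init_worker (hypothetical helper).
_gpu_queue = None


def gpu_slots(num_gpus, proc_per_gpu):
    """Return one entry per concurrent slot, e.g. [0, 0, 1, 1, ...]."""
    return [gpu_id for gpu_id in range(num_gpus) for _ in range(proc_per_gpu)]


def init_worker(gpu_queue):
    """Pool initializer: remember the shared queue in this worker process."""
    global _gpu_queue
    _gpu_queue = gpu_queue


def process_file(filename):
    """Borrow a GPU id for one file and always hand it back afterwards."""
    gpu_id = _gpu_queue.get()
    try:
        # ... process filename on GPU <gpu_id> (placeholder)
        return filename, gpu_id
    finally:
        _gpu_queue.put(gpu_id)


def run_all(files, num_gpus=NUM_GPUS, proc_per_gpu=PROC_PER_GPU):
    """Process every file with at most proc_per_gpu jobs per GPU at once."""
    ctx = mp.get_context()  # the platform's default start method
    gpu_queue = ctx.Queue()
    for gpu_id in gpu_slots(num_gpus, proc_per_gpu):
        gpu_queue.put(gpu_id)
    with ctx.Pool(processes=num_gpus * proc_per_gpu,
                  initializer=init_worker,
                  initargs=(gpu_queue,)) as pool:
        # Consume all results before leaving the with block,
        # because exiting it terminates the pool.
        return list(pool.imap_unordered(process_file, files))


if __name__ == "__main__":
    files = ["file{}.xyz".format(x) for x in range(16)]
    for filename, gpu_id in run_all(files):
        print("{} ran on GPU {}".format(filename, gpu_id))
```

The `if __name__ == "__main__":` guard matters here: without it, a spawned worker would try to create its own pool when it re-imports the module. Because each task returns its GPU id in the finally block, a failed file does not permanently take a slot out of circulation.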

120

answered Sep 28 '22 01:09

jodag
