I have trained a CNN model on a GPU using FastAI (PyTorch backend). I am now trying to use that model for inference on the same machine, but on the CPU instead of the GPU. Along with that, I am also trying to make use of multiple CPU cores using the multiprocessing module. Here is the issue:
- Running the code on a single CPU (without multiprocessing) takes only 40 seconds to process nearly 50 images.
- Running the code on multiple CPUs using torch multiprocessing takes more than 6 minutes to process the same 50 images.
Here is my code:

    import os
    # Hide the GPU before torch is imported so that inference runs on CPU only
    os.environ['CUDA_VISIBLE_DEVICES'] = ""

    from torch.multiprocessing import Pool, set_start_method
    from fastai.vision import *
    from fastai.text import *

    defaults.device = torch.device('cpu')

    def process_image_batch(batch):
        # Load the trained learner and put the model into evaluation mode
        learn_cnn = load_learner(scripts_folder, 'cnn_model.pkl')
        learn_cnn.model.training = False
        learn_cnn.model = learn_cnn.model.eval()
        # for image in batch:
        #     prediction = ...  # predicting the image here
        # return prediction

    if __name__ == '__main__':
        # image_batches = .....  # retrieving the image batches (it is a list of 5 lists)
        # n_processes = 5
        set_start_method('spawn', force=True)
        pool = Pool(n_processes)
        try:
            pool.map(process_image_batch, image_batches)
        except KeyboardInterrupt:
            exit()
        except Exception as e:
            print('Main Pool Error: ', e)
        finally:
            pool.terminate()
            pool.join()
I am not sure what is causing this slowdown in multiprocessing mode. I've read a lot of posts discussing similar issues but couldn't find a proper solution anywhere.
PyTorch uses a single thread pool for inter-op parallelism; this thread pool is shared by all inference tasks that are forked within the application process. In addition to inter-op parallelism, PyTorch can also utilize multiple threads within ops (intra-op parallelism).
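A practical consequence of this is CPU oversubscription: each spawned worker opens its own intra-op thread pool, so 5 workers each using the default thread count can end up fighting over the same physical cores and run slower than a single process. Below is a minimal, self-contained sketch of capping the intra-op threads per worker; the names init_worker and forward_batch, the dummy model, and the thread count of 1 are my assumptions, not from the question.

    import torch
    from torch.multiprocessing import Pool, set_start_method

    def init_worker():
        # Cap intra-op parallelism: with one thread per worker, five
        # workers together use five cores instead of 5 x (default threads)
        torch.set_num_threads(1)

    def forward_batch(batch):
        # Stand-in for real inference: one forward pass through a tiny model
        model = torch.nn.Linear(16, 2).eval()
        with torch.no_grad():
            return model(batch).shape

    if __name__ == '__main__':
        set_start_method('spawn', force=True)
        batches = [torch.randn(10, 16) for _ in range(5)]  # 5 dummy batches
        with Pool(5, initializer=init_worker) as pool:
            print(pool.map(forward_batch, batches))

A reasonable starting point for the per-worker thread count is total physical cores divided by the number of worker processes.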
I think you have made a very naive mistake here: you are loading the model object inside the function that you are parallelizing.
That means that for every batch of images, you are reloading the model from disk. Depending on the size of your model object, that I/O is going to be more time-consuming than running a forward pass.
Please consider reading the model once in the main process and then making the object available for inference in the parallel function.
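One caveat: with the 'spawn' start method, a learner loaded in the main process is not inherited by the workers, so a common way to apply this advice is to load the model once per worker process via the pool's initializer, instead of once per batch. A rough sketch of that pattern follows; scripts_folder, n_processes, and image_batches are the placeholders from the question, and init_worker is a name I made up.

    from torch.multiprocessing import Pool, set_start_method
    from fastai.vision import *

    learn_cnn = None  # populated once in each worker by init_worker

    def init_worker():
        global learn_cnn
        # Runs once per worker process, not once per batch: the model is
        # read from disk n_processes times in total, not once per task
        learn_cnn = load_learner(scripts_folder, 'cnn_model.pkl')
        learn_cnn.model.eval()

    def process_image_batch(batch):
        # Reuse the worker-local learner; no disk I/O happens here
        return [learn_cnn.predict(image) for image in batch]

    if __name__ == '__main__':
        set_start_method('spawn', force=True)
        with Pool(n_processes, initializer=init_worker) as pool:
            predictions = pool.map(process_image_batch, image_batches)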