 

Run inference on CPU using PyTorch and multiprocessing

I have trained a CNN model on GPU using FastAI (PyTorch backend). I am now trying to use that model for inference on the same machine, but on the CPU instead of the GPU. Along with that, I am also trying to make use of multiple CPU cores via the multiprocessing module. Here is the issue:

Running the code on a single CPU (without multiprocessing) takes only 40 seconds to process nearly 50 images.

Running the code on multiple CPUs using torch.multiprocessing takes more than 6 minutes to process the same 50 images.

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""

from torch.multiprocessing import Pool, set_start_method
from fastai.vision import *
from fastai.text import *

defaults.device = torch.device('cpu')

def process_image_batch(batch):

    # load the trained model and switch it to eval mode
    learn_cnn  = load_learner(scripts_folder, 'cnn_model.pkl')
    learn_cnn.model.training = False
    learn_cnn.model = learn_cnn.model.eval()
    # predictions = []
    # for image in batch:
    #     predictions.append(...)  # predicting the image here
    # return predictions

if __name__ == '__main__':
    #
    # image_batches = ..... # retrieving the image batches (a list of 5 lists)
    # n_processes = 5
    set_start_method('spawn', force=True)
    try:
        pool = Pool(n_processes)
        pool.map(process_image_batch, image_batches)
    except Exception as e:
        print('Main Pool Error: ', e)
    except KeyboardInterrupt:
        exit()
    finally:
        pool.terminate()
        pool.join()

I am not sure what's causing this slowdown in multiprocessing mode. I've read a lot of posts discussing similar issues but couldn't find a proper solution anywhere.

asked Sep 28 '19 by asanoop24



1 Answer

I think you have made a simple but costly mistake here: you are loading the model object inside the function that you are parallelizing.

That means the model is reloaded from disk for every batch of images the pool processes. Depending on the size of the model object, that I/O can easily take more time than running the forward pass itself.

Consider loading the model once and then making the object available for inference inside the parallel function, so each worker reuses it instead of reloading it.
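
A minimal sketch of that idea, assuming the fastai v1 API from the question (load_learner, Learner.predict) and the same scripts_folder, cnn_model.pkl, and image_batches placeholders. Since the question uses the 'spawn' start method, an object loaded in the main process is not inherited by the workers, so here each worker loads the model once in a Pool initializer and then reuses it for every batch it receives:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = ""

from torch.multiprocessing import Pool, set_start_method
from fastai.vision import *

defaults.device = torch.device('cpu')

# scripts_folder = ...  # directory containing cnn_model.pkl, as in the question
learn_cnn = None        # populated once per worker by init_worker

def init_worker():
    # Runs once when each worker process starts: load the model a single
    # time and cache it in a module-level global for reuse across batches.
    global learn_cnn
    learn_cnn = load_learner(scripts_folder, 'cnn_model.pkl')
    learn_cnn.model.eval()

def process_image_batch(batch):
    # The model is already in memory here, so there is no disk I/O per batch.
    # Learner.predict stands in for the prediction step elided in the question.
    return [learn_cnn.predict(image) for image in batch]

if __name__ == '__main__':
    # image_batches = ...  # list of 5 lists of images, as in the question
    n_processes = 5
    set_start_method('spawn', force=True)
    with Pool(n_processes, initializer=init_worker) as pool:
        results = pool.map(process_image_batch, image_batches)

This way the model is loaded n_processes times in total (once per worker) rather than once per batch, which is about as cheap as it gets under 'spawn'.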

answered Oct 25 '22 by tejas