
Running PyTorch multiprocessing in a Docker container with Gunicorn worker manager

  • I am trying to deploy a service on GCP. It's a Docker container that uses Gunicorn for worker management.

  • The code spawns a torch.multiprocessing.Process to handle a POST request as a background process.

This works if I run the script with the python3 command, but it hangs when running under Gunicorn.

  • My understanding is that CUDA needs thread-safe multiprocessing, which is why torch has its own implementation. When Gunicorn is set up to manage workers, this may be causing a conflict or thread-safety issue.

Has anyone come across this before? Is there a different worker manager I could be using?

In the Dockerfile: CMD gunicorn -w 1 -t 6000 -b 0.0.0.0:8080 --timeout 6000 --preload app_script:app - this is how I am running the app in Docker, so yes, I am using --preload. The issue also happens when I run the Docker container locally, so it's not just a GCP situation.

p = torch.multiprocessing.Process(target=my_function, args=(args,)); p.start() - this is how a POST call is being handled.
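A minimal sketch of the setup described above, assuming a Flask app (the actual framework, route, and my_function are not shown in the question, so those names are placeholders):

    import torch
    import torch.multiprocessing as mp
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def my_function(payload):
        # Placeholder for the real background work, e.g. running a model.
        model = torch.nn.Linear(4, 2)
        with torch.no_grad():
            model(torch.randn(1, 4))

    @app.route("/predict", methods=["POST"])
    def predict():
        payload = request.get_json()
        # Spawn the background process as in the question; p.start() returns
        # immediately, so the POST response is not blocked by the work.
        p = mp.Process(target=my_function, args=(payload,))
        p.start()
        return jsonify({"status": "accepted"}), 202

    if __name__ == "__main__":
        # Running directly with python3 (the case that works in the question).
        app.run(host="0.0.0.0", port=8080)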

asked Dec 02 '19 by Suryatapa Roy


1 Answer

I spent a lot of time investigating a similar issue: PyTorch calls were getting stuck when running in a Docker container with Gunicorn.

The solution that worked for me was removing the --preload flag from the gunicorn command in the Dockerfile.
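Applied to the command in the question, that would mean changing the Dockerfile line to the same command without --preload:

    CMD gunicorn -w 1 -t 6000 -b 0.0.0.0:8080 --timeout 6000 app_script:app

A likely explanation, though I can't confirm it for this exact app: with --preload, Gunicorn imports the application in the master process and then forks the workers, so if anything in app_script initializes CUDA at import time, the forked workers (and the torch.multiprocessing children they spawn) inherit an already-initialized CUDA context, which CUDA does not support and which typically shows up as a hang. Without --preload, each worker imports the app after the fork, avoiding that situation.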

answered Oct 01 '22 by NitzanS