I have some code using multi-process like this:
import multiprocessing
from multiprocessing import Pool
pool = Pool(processes=100)
result = []
for job in job_list:
result.append(
pool.apply_async(
handle_job, (job)
)
)
pool.close()
pool.join()
This program is doing heavy calculation on very big data set. So we need multi-process to handle the job concurrently to improve performance.
I have been told that to the hosting system, one docker container is just one process. So I am wondering how my multi-process will be handled in Docker?
Below are my concern:
Since the container is just one process, will my multi-process code become multi-threading in the process?
Will the performance become down? Because the reason I use multi-process is to get job done concurrently to get better performance.
I suspect much of the confusion comes from thinking of containers as a lightweight VM. Instead, think of Linux containers as a way to run a process with some settings for namespaces and cgroups.
One of those namespaces is the pid namespace. When you configure it, you see the first process in that namespace as pid 1 from within the namespace. From another pid namespace, you cannot see those other namespaces, or the host namespace. And on the host, outside of any namespace, you will see all processes, including those in any namespace.
When you fork a new process, you inherit the same namespaces and cgroups, so you will get a new pid within the pid namespace, allowing you to run multiple processes just like any other Linux environment. Inside of the container, you can run a ps
command (assuming it's included in your image) and see multiple processes running:
$ docker run -it --rm busybox /bin/sh
/ # sleep 30s &
/ # ps -ef
PID USER TIME COMMAND
1 root 0:00 /bin/sh
7 root 0:00 sleep 30s
8 root 0:00 ps -ef
Where the advice comes to only run a single process is not from multi-threaded apps, but instead from people treating the container as a lightweight VM. They will spawn multiple applications that have no hard dependency on each other, like a web server and a database and a mail server. When this is done, there are a couple key issues:
In short, the design of managing containers assumes one application per container, and if you break that assumption, you get to keep both pieces when the tooling doesn't support your use case.
A few words of caution:
--init
).If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With