This question is part of my continuing exploration of Docker and in some ways follows up on one of my earlier questions. I have now understood how one can get a full application stack (effectively a mini VPS) working by linking together a bunch of Docker containers. For example, one could create a stack that provides Apache + PHP5 with a sheaf of extensions + Redis + Memcached + MySQL, all running on top of Ubuntu, with or without an additional data container to make it easy to serialize user data.
All very nice and elegant. However, I cannot help but wonder... 5 containers to run that little VPS (I count 5, not 6, since Apache + PHP5 go into one container). So suppose I have 100 such VPSs running? That means 500 running containers! I understand the arguments here: it is easy to compose new app stacks, update one component of the stack, and so on. But are there no unnecessary overheads to operating this way?
Suppose I did this instead:
Write up a little shell script, start.sh:
#!/bin/bash
service memcached start
service redis-server start
....
service apache2 start
while :
do
    :
done
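(As an aside, a while : ; do : ; done loop busy-waits and will pin a CPU core; a purely illustrative sketch of the same idea that sleeps instead:)
#!/bin/bash
# start the services, then block without burning CPU
service memcached start
service redis-server start
service apache2 start
while :
do
    sleep 60
done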
In my Dockerfile I have
ADD start.sh /usr/local/bin/start.sh
RUN chmod +x /usr/local/bin/start.sh
....
ENTRYPOINT ["/bin/bash"]
CMD ["/usr/local/bin/start.sh"]
I then get that container up & running
docker run -d -p 8080:80 -v /var/droidos/site:/var/www/html -v /var/droidos/logs:/var/log/apache2 droidos/minivps
and I am in business. Now, when I want to shut down that container programmatically, I can do so with a single docker command.
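For instance (assuming the container was started with --name minivps, or filtering by the image it was built from):
docker stop minivps
# or stop every container started from that image
docker ps -q --filter ancestor=droidos/minivps | xargs docker stop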
There are many questions of a similar nature to be found when one googles for them. Apart from the arguments I have reproduced above, one of the commonest reasons given for the one-app-per-container approach is "that is the way Docker is designed to work". What I would like to know is whether there are any real downsides to the all-in-one approach I have sketched above.
It's ok to have multiple processes, but to get the most benefit out of Docker, avoid one container being responsible for multiple aspects of your overall application. You can connect multiple containers using user-defined networks and shared volumes.
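For example, a rough sketch of wiring such a stack together with a user-defined network and a named volume (the names and stock images here are illustrative):
docker network create droidos-net
docker volume create droidos-data
# containers on the same user-defined network can reach each other by name
docker run -d --name db    --network droidos-net -v droidos-data:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=secret mysql
docker run -d --name redis --network droidos-net redis
docker run -d --name web   --network droidos-net -p 8080:80 droidos/minivps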
Container-based application design encourages certain principles. One of these principles is that there should just be one process running in a container. That is to say, a Docker container should have just one program running inside it.
A container is basically a process. There is no technical issue with running 500 processes on a decent-sized Linux system, although they will have to share the CPU(s) and memory.
The cost of a container over a process is some extra kernel resources to manage namespaces, file systems and control groups, and some management structures inside the Docker daemon, particularly to handle stdout and stderr.
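You can get a feel for that overhead on a running host (<container> below is any container name or ID):
docker ps -q | wc -l      # how many containers are running
docker stats --no-stream  # CPU and memory usage of each container
docker top <container>    # the processes running inside one container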
The namespaces are introduced to provide isolation, so that one container does not interfere with any others. If your groups of 5 containers form a unit that does not need this isolation, then you can share the network namespace using --net=container. There is no feature at present to share cgroups, AFAIK.
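For example, a second container can join an existing container's network namespace like this (the names are hypothetical):
docker run -d --name web droidos/minivps
docker run -d --name cache --net=container:web redis
Both containers then share the same network interfaces, so they can talk to each other over localhost.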
What is wrong with what you suggest:
stdout and stderr will be intermingled for the five processes.

@Bryan's answer is solid, particularly in relation to the overheads of a container that just runs one process being low.
That said, you should at least read the arguments at https://phusion.github.io/baseimage-docker/, which makes a case for having containers with multiple processes. Without them, Docker is light on provision for things like process supervision, cron jobs, and logging via syslog.
baseimage-docker runs an init process which fires up a few processes besides the main one in the container.
For some purposes this is a good idea, but be aware that, for instance, running a cron daemon and a syslog daemon in every container adds a bit more overhead. I expect that as the Docker ecosystem matures we'll see better solutions that don't require this.
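For reference, a rough sketch of a multi-process image built on baseimage-docker, following the runit and my_init conventions its documentation describes (the tag and the service script are illustrative):
FROM phusion/baseimage:<tag>   # pick a concrete release tag
....
# register apache2 as a runit service; my_init supervises everything under /etc/service
RUN mkdir -p /etc/service/apache2
ADD apache2-run.sh /etc/service/apache2/run
RUN chmod +x /etc/service/apache2/run
CMD ["/sbin/my_init"]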