I have an API server running Node.js that was using its cluster module, and testing looked pretty good. Now our IT department wants to move to Docker containers, which I'm happy about, but I've never actually used Docker beyond playing around with it. It occurred to me, though, that the Node.js app runs as a single process inside its container, so the cluster module may not be the best fit: that single container process could become the bottleneck of the setup until the cluster module splits the request up across its workers.
So really, a cluster of Docker containers that can be started and stopped on the fly matters more than using Node.js' cluster module, correct?
If I have a cluster of containers, would using Node.js' cluster module get me anything? The API endpoints take less than 0.5 s to return (usually quite a bit less).
I'm using MySQL (I believe it's a single server, nothing more currently), so there shouldn't be any reason to use a data integrity solution for it.
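For context, the cluster-module setup described above is roughly the following pattern. This is only a sketch with a hypothetical `app.js` (the question doesn't include the actual code): the primary process forks one worker per CPU and distributes incoming connections to them.

```js
// app.js - minimal sketch of a Node.js cluster-module server (hypothetical example)
const cluster = require('cluster');
const http = require('http');
const os = require('os');

if (cluster.isPrimary) { // cluster.isMaster on Node < 16
  // Fork one worker per CPU; the primary hands incoming connections to the workers.
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
  cluster.on('exit', (worker) => {
    console.log(`worker ${worker.process.pid} exited, forking a replacement`);
    cluster.fork();
  });
} else {
  // Each worker runs its own HTTP server; they all share port 3000.
  http.createServer((req, res) => {
    res.end(`handled by pid ${process.pid}\n`);
  }).listen(3000);
}
```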
Docker Swarm is the Docker-native solution for deploying a cluster of Docker hosts. You can use it to quickly stand up such a cluster either on your local machine or on supported cloud platforms.
A cluster in Docker refers to multiple nodes joined using swarm mode. Containers are scheduled across the various nodes, and overlay networking is configured so that, to the containers, the network looks similar to a bridge network even though it spans multiple nodes.
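As a rough illustration (the image name `myorg/api` and port 3000 are placeholders, not from the question), standing up a small swarm and a replicated service looks something like this:

```sh
# On the first host: enable swarm mode; this host becomes a manager.
docker swarm init

# On each additional host: join the swarm with the token printed by `docker swarm init`.
# docker swarm join --token <worker-token> <manager-ip>:2377

# Run the API as a replicated service. Swarm schedules the replicas across the
# nodes, and the routing mesh load-balances requests published on port 80.
docker service create --name api --replicas 4 -p 80:3000 myorg/api
```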
The difference with PM2 is that it uses the Node.js cluster module. PM2 creates multiple processes, and the cluster module is responsible for distributing incoming traffic to each process. With Docker, distribution of traffic is handled by a load balancer, which we'll talk about in a bit.
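To make that contrast concrete, here is a sketch of the two approaches side by side (assuming a hypothetical `app.js` and a Compose service named `api`):

```sh
# PM2: one process manager forks 4 copies of the app via Node's cluster module,
# which spreads incoming connections across them inside a single host/container.
pm2 start app.js -i 4

# Docker: run 4 single-process containers instead and let an external load
# balancer (nginx, HAProxy, Traefik, or an orchestrator's routing mesh)
# spread the requests, e.g. with Compose:
docker-compose up -d --scale api=4
```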
What I've seen as the best solution when using Docker is to keep as few processes per container as possible, since containers are lightweight; you don't want a container's processes trying to use more than one CPU. So running a cluster inside the container won't add any value and might worsen latency.
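In practice, "one process per container" simply means the image starts the app directly, with no process manager or cluster primary in front of it. A minimal sketch (base image and file names are assumptions):

```dockerfile
# Dockerfile - a single Node.js process per container; you scale by running
# more containers, not by forking workers inside this one.
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev   # install production dependencies only
COPY . .
CMD ["node", "server.js"]   # node is the only process running in the container
```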
Here, Chad Robinson explains the idea in general terms: https://medium.com/@CodeAndBiscuits/understanding-nodejs-clustering-in-docker-land-64ce2306afef#.9x6j3b8vw
Kubernetes, Rancher, Mesos and other container management layers handle the load-balancing. They provide "scheduling" (moving those Docker container slices around different CPUs and machines to get a good usage across the cluster) and "networking" (load balancing inbound requests to those containers) layers internally.
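In Kubernetes terms, for instance, those two layers roughly map onto a Deployment (scheduling a set of replicas across nodes) and a Service (load balancing inbound requests across them). A minimal sketch with placeholder names:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
spec:
  replicas: 4                 # the scheduler spreads these pods across the cluster
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: myorg/api:latest   # placeholder image
          ports:
            - containerPort: 3000
---
apiVersion: v1
kind: Service                 # load-balances inbound requests across the pods above
metadata:
  name: api
spec:
  selector:
    app: api
  ports:
    - port: 80
      targetPort: 3000
```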
Update
I think it's worth adding a link to Why it is recommended to run only one process in a container?, where people share their ideas and experiences; chiefly, from Jon's answer there are some interesting points:
Provided that you give a single responsibility (single process, function or concern) to a container: good idea. Docker names this 'concern' ;)
You'll have to measure to be sure, but my hunch is that running with Node's cluster module would be worthwhile. It would get you more CPU utilization with the least amount of extra overhead: no extra containers to manage (start, stop, monitor), and the cluster workers have an efficient communication mechanism. The most reasonable evolution (don't skip steps) would seem to me: