I have a Docker swarm with a lot of containers, but in particular:
My problem is that when a node fails, the manager discards the current container and creates a new one on another node. So every time this happens I lose the data persisted in that particular container, even when using Docker volumes.
So I would like to create four distributed GlusterFS volumes over my cluster and mount them as Docker volumes in my containers.
Is this the right way to solve my problem?
If it is, what type of filesystem should I use for my GlusterFS volumes?
Are there performance problems with this approach?
GlusterFS would not be the correct way to solve this for all of your containers, since Gluster does not support "structured data", as stated in the GlusterFS Install Guide:
Gluster does not support so-called "structured data", meaning live SQL databases. Of course, using Gluster to back up and restore the database would be fine. Gluster is traditionally better when using file sizes of at least 16KB (with a sweet spot around 128KB or so).
One solution to this would be master-slave replication for the data in your databases. MySQL and MongoDB both support this (as described here and here), as do most common DBMSs.
Master-slave replication basically means that, of two or more copies of your database, one is the master and the rest are slaves. All write operations happen on the master, and all read operations happen on the slaves. Any data written to the master is replicated across the slaves by the master. Some DBMSs also provide a way to detect when the master goes down and elect a new master if this happens, but I don't think all DBMSs do this.
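As a concrete illustration of the failover variant, MongoDB replica sets elect a new primary automatically. A minimal single-host sketch (the ports, data paths, and replica set name are made up for illustration, and this is not a production layout — in a real deployment each member would run on its own node):

```shell
# Start three mongod processes as members of one replica set.
# --fork requires --logpath; paths/ports here are hypothetical.
mongod --replSet rs0 --port 27017 --dbpath /data/rs0-0 --fork --logpath /data/rs0-0.log
mongod --replSet rs0 --port 27018 --dbpath /data/rs0-1 --fork --logpath /data/rs0-1.log
mongod --replSet rs0 --port 27019 --dbpath /data/rs0-2 --fork --logpath /data/rs0-2.log

# Initiate the set; the members hold an election and pick a primary.
# If that primary later dies, the surviving members elect a new one,
# so a rescheduled container can rejoin without losing the data set.
mongosh --port 27017 --eval '
  rs.initiate({
    _id: "rs0",
    members: [
      { _id: 0, host: "localhost:27017" },
      { _id: 1, host: "localhost:27018" },
      { _id: 2, host: "localhost:27019" }
    ]
  })'
```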
You could alternatively set up a Galera Cluster, but as far as I'm aware this only supports MySQL and its derivatives (e.g. MariaDB).
I would have thought you could use GlusterFS for Fluentd and Elasticsearch, but I'm not familiar enough with either to say for certain. I imagine it would depend on how they store any data they collect (if they collect any at all).
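If you do go the Gluster route for the containers that don't hold structured data, one common pattern is to mount the Gluster volume through Docker's `local` volume driver, which passes `type`, `device`, and `o` options straight to the mount call. A sketch, assuming `glusterfs-fuse` is installed on every swarm node and a replicated Gluster volume named `gv0` is served by host `node1` (all names hypothetical):

```shell
# Declare the volume inline on the service so it is created lazily on
# whichever node the task lands on; after rescheduling, the new task
# mounts the same Gluster volume and sees the same data.
docker service create --name web \
  --mount type=volume,source=gfs-data,target=/data,volume-driver=local,volume-opt=type=glusterfs,volume-opt=device=node1:/gv0,volume-opt=o=defaults \
  nginx:alpine
```

Declaring the volume options in `--mount` (rather than a one-off `docker volume create` on the manager) matters in swarm mode, because `local` volumes are per-node and would otherwise only exist on the node where you ran the create command.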