 

Docker volume in swarm

Can someone confirm this for me?

When running a service in docker swarm, a volume using the local driver will be created on the same node as the container running the service's task.

If a service spawns tasks on 2 different nodes, each container will see different data in its own respective mounted volume.

For example, if I had a service running on node1 that creates/populates a volume, that data will only ever be visible to node1 if the volume is created with the local driver.

So if I had a service running on node1 that updates a volume called project-addons, and a service being spawned on node2, the second service would be able to mount the project-addons volume, but it would be empty. If I wanted project-addons populated correctly everywhere, I'd have to run a task on every node or use a volume driver that is swarm aware (one that replicates data across swarm nodes).

So if my understanding is correct, volume definitions aren't node specific and can be referenced from any node, but if the volume driver is local, it's quite possible that I might mount an empty volume.
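For concreteness, here's a hypothetical stack file for the scenario described (image, command and names made up); with the default local driver, each node a task lands on ends up with its own independent project-addons folder:

```yaml
version: "3.8"

services:
  addons:
    image: alpine
    command: sh -c "date >> /data/seen && sleep 3600"
    deploy:
      replicas: 2            # tasks may be scheduled on different nodes
    volumes:
      - project-addons:/data

volumes:
  project-addons: {}         # local driver: one separate folder per node
```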

asked Mar 21 '19 by Loïc Faure-Lacroix

People also ask

How do volumes work with docker Swarm?

Swarm Mode itself does not do anything different with volumes, it runs any volume mount command you provide on the node where the container is running. If your volume mount is local to that node, then your data will be saved locally on that node.

What is the docker volume?

Docker volumes are a widely used and useful tool for ensuring data persistence while working in containers. Docker volumes are file systems mounted on Docker containers to preserve data generated by the running container.

How can I see docker volumes?

You can use the docker volume ls command to view a list of data volumes, and the docker volume inspect command to view a volume's details.

What does docker volume create do?

Creates a new volume that containers can consume and store data in.


1 Answer

Here's a draft answer, as I've learned quite a bit since this question was asked.

First, it's important to understand what a volume is.

A volume is a way for docker to describe a mount point. When a volume gets created, it doesn't actually get physically mounted anywhere until a container needs it.

So if you have a docker swarm with multiple nodes, when you create a volume, essentially the description of the volume gets replicated on each node but nothing else happens.

When a container boots up, it will try to mount the volume on the host where it's booting. If the volume isn't physically present yet, it gets created on first use and reused from then on. So if you're using the local driver, it will essentially create a folder, and that's it.

If you have multiple hosts, it means each host will create its own folder on demand.

So, essentially, a volume driver can be described by at least these 4 methods:

  1. mount
  2. unmount
  3. create
  4. delete
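To make those 4 methods concrete, here's a minimal sketch of a local-style driver. This is an assumption-laden illustration, not a real plugin: an actual plugin serves these operations as JSON endpoints (/VolumeDriver.Create, /VolumeDriver.Mount, /VolumeDriver.Unmount, /VolumeDriver.Remove) over a unix socket, while here they are plain methods so the logic is easy to follow.

```python
import os
import shutil

class LocalLikeDriver:
    """Sketch of the four core volume-driver operations.

    The {"Err": ""} / {"Mountpoint": ...} response shapes mirror the
    docker volume plugin protocol; the class name and base path are
    made up for illustration.
    """

    def __init__(self, base):
        self.base = base  # e.g. /var/lib/myplugin/volumes

    def create(self, name):
        # Prepare the volume's folder; nothing gets mounted yet.
        os.makedirs(os.path.join(self.base, name), exist_ok=True)
        return {"Err": ""}

    def mount(self, name):
        path = os.path.join(self.base, name)
        if not os.path.isdir(path):
            return {"Mountpoint": "", "Err": f"no such volume: {name}"}
        return {"Mountpoint": path, "Err": ""}

    def unmount(self, name):
        # Nothing to tear down for a plain folder.
        return {"Err": ""}

    def delete(self, name):
        shutil.rmtree(os.path.join(self.base, name), ignore_errors=True)
        return {"Err": ""}
```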

If you want to set up a swarm-aware driver, the starting point would be to define a plugin that implements those 4 methods. Drivers are implemented as HTTP services that communicate with the docker daemon; the driver receives an action and will, for example, create or remove a folder.

So at this point you should understand that, in the end, a volume can be nothing more than a mount point. Anything that can be mounted can be used as a volume.

The issue, though, is that even if you can mount a network drive, the mount process is very dumb: the only thing it can do is mount something that already exists. So unless you implement your driver to do extra work, like creating a remote directory before trying to mount it, you'll be forced to do things a different way.

Take Amazon EFS, for example. You can mount it as an NFS drive and it just works. But let's say you want to share your NFS drive between different services. If you mount the root of your NFS drive as a volume, that won't work: the root gets shared between services, and containers will likely see data they shouldn't.

One way that I found to implement that is to have your NFS drive structured as such:

  • / the root
  • /volumes/[volume_name] the volume you'd mount in a service

It's a simplified layout, but in short: each volume lives in its own folder under a dedicated directory of the shared network drive.

But since the mount process is pretty dumb, a simple mount of a folder that doesn't exist will fail: if you point a volume at /volumes/fun but /volumes/fun doesn't exist, no luck. Docker isn't smart enough to create /volumes/fun by default; that's something a volume driver plugin could do.
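To make that failure mode concrete, here's a tiny helper (the function name and root path are made up) that resolves a volume name to its folder under the shared drive and refuses to proceed when the folder is missing, which mirrors what a plain mount of a nonexistent subdirectory does:

```python
import os

VOLUMES_ROOT = "/mnt/nfs/volumes"  # hypothetical mount of the shared drive

def resolve_volume_path(name, root=VOLUMES_ROOT):
    """Map a volume name to its folder under the shared drive.

    Raises FileNotFoundError when the folder doesn't exist, the same
    way a dumb mount of a missing directory simply fails; nothing in
    the mount path will create the folder for you.
    """
    path = os.path.join(root, name)
    if not os.path.isdir(path):
        raise FileNotFoundError(f"{path} does not exist")
    return path
```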

But luckily for us, there's a way to do that without going as far as installing plugins.

One way to achieve volume creation on network drives is to have a "watchdog" service that notices when a volume is created. This service mounts the / of the remote drive (NFS, SSHFS or whatever; it doesn't matter as long as you have write access).

It then listens for events, or polls the docker daemon for volumes. If it finds a volume that is tagged in some way as belonging to the drive it's watching, it checks whether the corresponding folder exists and creates it if it doesn't.

If a container is started before the folder is created, it will simply fail to start, but the moment the watchdog creates the folder, the service will boot up as if nothing had happened.
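A sketch of that watchdog, assuming a "swarm.volume.name" label convention (the label keys, paths and polling interval are all made up). The folder-creation step is split into a plain function so it works with any event source; the polling loop itself needs the docker SDK for Python and access to the daemon, so it is shown only as a comment:

```python
import os

def ensure_volume_dirs(labelled_volumes, root):
    """Create any missing per-volume folders under root.

    `labelled_volumes` is a list of label dicts (one per docker volume);
    only entries carrying a 'swarm.volume.name' label are handled.
    Returns the list of folders created on this pass.
    """
    created = []
    for labels in labelled_volumes:
        name = (labels or {}).get("swarm.volume.name")
        if not name:
            continue  # not tagged for this drive, ignore it
        path = os.path.join(root, name)
        if not os.path.isdir(path):
            os.makedirs(path)
            created.append(path)
    return created

# Polling loop sketch (requires the 'docker' SDK and daemon access):
#
# import time, docker
# client = docker.from_env()
# while True:
#     labels = [v.attrs.get("Labels") or {} for v in client.volumes.list()]
#     ensure_volume_dirs(labels, "/mnt/nfs/volumes")
#     time.sleep(5)
```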

In the end, it's quite possible to make "swarm aware" volumes without resorting to all sorts of strange volume driver plugin implementations.

A local driver is just a mount point. What's important is having a way to mount something, and a way to know whether the thing you're trying to mount exists; if it doesn't, you can create it easily by having access to the docker daemon.

One strategy would be to set labels on a volume like this:

  • swarm.volume.source : source label
  • swarm.volume.name : name of folder to mount in /volumes/...

This way you don't have to parse option arguments: you can use the defined labels directly, and the moment labels are defined it's easy to know which volumes your watchdog actually cares about.
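For reference, such labels can be attached when the volume is created; a hypothetical invocation (the label keys and values are the convention sketched above, not anything docker interprets itself):

```shell
docker volume create \
  --label swarm.volume.source=nfs-main \
  --label swarm.volume.name=project-addons \
  project-addons
```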

But at this point it's pretty dependent on the infrastructure, as you could just as well create the folder under /volumes at the same time you create the volume.

So saying you need swarm-aware drivers to have remote drives in docker swarm is a bit misleading. There's no such thing as "swarm aware": all volumes are equally "swarm aware". It really comes down to the result you're expecting.

answered Oct 05 '22 by Loïc Faure-Lacroix