I am running several Docker containers on three machines, composing a Swarm cluster.
Some containers that store persistent data (such as a DB, Redis, etc.) use data volumes. (I try to avoid bind mounts as far as I can.)
These data volumes are located in /var/lib/docker/volumes/, and every volume is assigned a custom name rather than a random ID:
# ls /var/lib/docker/volumes/
redis-data postgres-data fluentd-data ...
I want to back up these volumes periodically, daily for example, so that I can restore them after a failed machine has been repaired.
However, every document I found on Google shows an approach that uses a new Linux container and tar:
https://docs.docker.com/storage/volumes/#backup-restore-or-migrate-data-volumes
$ docker run --rm --volumes-from dbstore -v $(pwd):/backup ubuntu tar cvf /backup/backup.tar /dbdata
Why? Is there any problem if I simply archive the /var/lib/docker/volumes/VOLUME directory and copy it to another machine? For example, with permissions, uid, gid, etc.?
$ tar -zcvf redis.tgz /var/lib/docker/volumes/redis-data
P.S.
A backup made with tar could be inconsistent if the data changes while it is being archived, for example when archiving a DB data directory while the DB is still running and inserts or updates are being performed. But I think this problem applies to both approaches in the same way.
A named volume can store data outside of /var/lib/docker. E.g. you can create a named bind mount with:
$ docker volume create --driver local \
--opt type=none \
--opt device=/home/user/test \
--opt o=bind \
test_vol
or here's one for an NFS mount:
$ docker volume create --driver local \
--opt type=nfs \
--opt o=nfsvers=4,addr=nfs.example.com,rw \
--opt device=:/path/to/dir \
foo
In these scenarios, the tar backup accesses the data the same way your container does, and therefore performs a backup regardless of how the named volume was created. It also effectively exports the data to a common format that can be used not only by other containers, but anywhere you happen to move your application.
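As a rough sketch of that container-based approach, the two functions below back up and restore a named volume through a throwaway container. The function names, the /data and /backup mount points, and the archive naming are my own choices for illustration, and the ubuntu image is only used because the linked docs use it; any image that ships tar should do.

```shell
#!/bin/sh
# Sketch only: function names, mount points and archive names are illustrative.

# backup_volume VOLUME DEST_DIR
# Mounts the named volume read-only and writes DEST_DIR/VOLUME.tar.gz.
# DEST_DIR should be an absolute host path, otherwise Docker treats it
# as a volume name rather than a bind mount.
backup_volume() {
    volume="$1"
    dest="$2"
    docker run --rm \
        -v "$volume":/data:ro \
        -v "$dest":/backup \
        ubuntu tar -czf "/backup/$volume.tar.gz" -C /data .
}

# restore_volume VOLUME SRC_DIR
# Unpacks SRC_DIR/VOLUME.tar.gz into the named volume.
restore_volume() {
    volume="$1"
    src="$2"
    docker run --rm \
        -v "$volume":/data \
        -v "$src":/backup:ro \
        ubuntu tar -xzf "/backup/$volume.tar.gz" -C /data
}
```

For example, `backup_volume redis-data "$(pwd)"` on the old machine and `restore_volume redis-data "$(pwd)"` on the new one. Because the archive is created relative to the volume root, it does not depend on where the volume's data physically lives.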
If you find yourself needing more control over the volume contents, for more direct backups, then the named bind mount is a midway point between named volumes and host mounts. You get to treat the directory as a named volume to the container, but the contained data as just another directory on the host to back up.
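With a named bind mount, the direct backup can be as simple as archiving the host directory you passed as the device. A minimal sketch, where the function name and the archive path in the example are hypothetical:

```shell
#!/bin/sh
# Sketch only: backup_bind_dir is an illustrative name, not a Docker command.

# backup_bind_dir SRC_DIR ARCHIVE
# Archives the contents of SRC_DIR (the directory behind the named bind
# mount) relative to its root, so it can be unpacked into any directory.
backup_bind_dir() {
    src="$1"
    archive="$2"
    tar -czf "$archive" -C "$src" .
}
```

For the test_vol example above that would be something like `backup_bind_dir /home/user/test /backups/test_vol.tar.gz`. Run it as a user that can read every file in the directory, and restore as root if you need the original uid/gid preserved.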
Personally, I tend to treat /var/lib/docker as a black box. While the contents are very readable, Docker is free to migrate and change things in there between versions, while the API used by users should remain more consistent. The fewer things I need to change should they transition to something like the containerd image management, the better.
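If you do want to know where a volume's data lives, you can ask Docker rather than assume a path under /var/lib/docker. The sketch below wraps two CLI calls; the function names are mine, and the output layout is just one way to print the fields.

```shell
#!/bin/sh
# Sketch only: function names are illustrative.

# volume_info VOLUME
# Prints the driver, the mountpoint Docker reports, and the driver options.
# For a named bind or NFS volume, the real data location shows up in the
# options (device), not in the mountpoint.
volume_info() {
    docker volume inspect \
        --format '{{ .Driver }} {{ .Mountpoint }} {{ json .Options }}' "$1"
}

# list_local_volumes
# Prints the names of all volumes that use the local driver.
list_local_volumes() {
    docker volume ls --quiet --filter driver=local
}
```

A daily job could loop over `list_local_volumes` and back up each name through a container, without ever touching /var/lib/docker directly.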