Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to deal with persistent storage (e.g. databases) in Docker

How do people deal with persistent storage for your Docker containers?

I am currently using this approach: build the image, e.g. for PostgreSQL, and then start the container with

docker run --volumes-from c0dbc34fd631 -d app_name/postgres 

IMHO, that has the drawback, that I must not ever (by accident) delete container "c0dbc34fd631".

Another idea would be to mount host volumes "-v" into the container, however, the userid within the container does not necessarily match the userid from the host, and then permissions might be messed up.

Note: Instead of --volumes-from 'cryptic_id' you can also use --volumes-from my-data-container where my-data-container is a name you assigned to a data-only container, e.g. docker run --name my-data-container ... (see the accepted answer)

like image 746
juwalter Avatar asked Aug 28 '13 19:08

juwalter


People also ask

How do you store data persistently in Docker?

Volumes are the best way to persist data in Docker. Bind mounts may be stored anywhere on the host system. They may even be important system files or directories. Non-Docker processes on the Docker host or a Docker container can modify them at any time.

What is persistent storage in Docker?

In docker, the persistent storage is dealt with the volume concept. Persistent storage is means when we are stopping or removing the container the data should be persistent. It will not delete automatically once the docker container is not available.

Can I store database in Docker?

With it you can control your data using the same tools you use for your stateless applications by harnessing the power of ZFS on Linux. This means that you can run your databases, queues and key-value stores in Docker and move them around as easily as the rest of your application.

Which Docker allows containers to run with persistent volumes?

Docker has an option to allow specific folders in a container to be mapped to the normal filesystem on the host. This allows us to have data in the container without making the data part of the Docker image, and without being bound to AUFS.


2 Answers

In Docker release v1.0, binding a mount of a file or directory on the host machine can be done by the given command:

$ docker run -v /host:/container ... 

The above volume could be used as a persistent storage on the host running Docker.

like image 33
amitmula Avatar answered Sep 22 '22 15:09

amitmula


Docker 1.9.0 and above

Use volume API

docker volume create --name hello docker run -d -v hello:/container/path/for/volume container_image my_command 

This means that the data-only container pattern must be abandoned in favour of the new volumes.

Actually the volume API is only a better way to achieve what was the data-container pattern.

If you create a container with a -v volume_name:/container/fs/path Docker will automatically create a named volume for you that can:

  1. Be listed through the docker volume ls
  2. Be identified through the docker volume inspect volume_name
  3. Backed up as a normal directory
  4. Backed up as before through a --volumes-from connection

The new volume API adds a useful command that lets you identify dangling volumes:

docker volume ls -f dangling=true 

And then remove it through its name:

docker volume rm <volume name> 

As @mpugach underlines in the comments, you can get rid of all the dangling volumes with a nice one-liner:

docker volume rm $(docker volume ls -f dangling=true -q) # Or using 1.13.x docker volume prune 

Docker 1.8.x and below

The approach that seems to work best for production is to use a data only container.

The data only container is run on a barebones image and actually does nothing except exposing a data volume.

Then you can run any other container to have access to the data container volumes:

docker run --volumes-from data-container some-other-container command-to-execute 
  • Here you can get a good picture of how to arrange the different containers.
  • Here there is a good insight on how volumes work.

In this blog post there is a good description of the so-called container as volume pattern which clarifies the main point of having data only containers.

Docker documentation has now the DEFINITIVE description of the container as volume/s pattern.

Following is the backup/restore procedure for Docker 1.8.x and below.

BACKUP:

sudo docker run --rm --volumes-from DATA -v $(pwd):/backup busybox tar cvf /backup/backup.tar /data 
  • --rm: remove the container when it exits
  • --volumes-from DATA: attach to the volumes shared by the DATA container
  • -v $(pwd):/backup: bind mount the current directory into the container; to write the tar file to
  • busybox: a small simpler image - good for quick maintenance
  • tar cvf /backup/backup.tar /data: creates an uncompressed tar file of all the files in the /data directory

RESTORE:

# Create a new data container $ sudo docker run -v /data -name DATA2 busybox true # untar the backup files into the new container᾿s data volume $ sudo docker run --rm --volumes-from DATA2 -v $(pwd):/backup busybox tar xvf /backup/backup.tar data/ data/sven.txt # Compare to the original container $ sudo docker run --rm --volumes-from DATA -v `pwd`:/backup busybox ls /data sven.txt 

Here is a nice article from the excellent Brian Goff explaining why it is good to use the same image for a container and a data container.

like image 130
tommasop Avatar answered Sep 25 '22 15:09

tommasop