
What does VOLUME inside a Dockerfile do?

I am trying to understand the Dockerfile below:

https://github.com/nfnty/dockerfiles/blob/master/images/arch-bootstrap/latest/Dockerfile

FROM nfnty/arch-mini:latest
.....
RUN     install --directory --owner=root --group=root --mode=700 /var/lib/bootstrap/{,archive}

USER root
VOLUME ["/var/lib/bootstrap"]
ENTRYPOINT ["/opt/bootstrap/build.sh"]

RUN creates the directory /var/lib/bootstrap/archive, so after the build the image permanently contains this folder.

When a container is created from the image, it will have the folder "/var/lib/bootstrap/archive" because it exists in the image.

What is the point of declaring VOLUME /var/lib/bootstrap/?

I understand that on the command line -v [host path]:[container path] mounts the host folder on the container's folder.

But what does VOLUME in a Dockerfile do, especially in the above case?

OK, here are some tests I have done:

-- Trying to create a container from the Dockerfile above,
i.e. VOLUME ["/var/lib/bootstrap"]

hostsystem#  docker run -it --entrypoint=/bin/bash nfnty/arch-bootstrap
[root@684120b46cfb /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 05:53 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive

-- I have created a sample001.txt file inside it.
[root@684120b46cfb /]# touch /var/lib/bootstrap/sample001.txt
[root@684120b46cfb /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 05:54 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive
-rw-r--r-- 1 root root    0 Oct 18 05:54 sample001.txt
[root@684120b46cfb /]# 

[root@684120b46cfb /]# exit

-- As per @izazkhan's answer, the VOLUME ["/var/lib/bootstrap"]
instruction persists the data by creating a volume in
/var/lib/docker on the host and mounting it on /var/lib/bootstrap
in the container. So I expect sample001.txt to be at
/var/lib/docker/(var/lib/bootstrap)

-- Now again trying to create a container
hostsystem#  docker run -it --entrypoint=/bin/bash nfnty/arch-bootstrap
[root@5fa7c4fc72e2 /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 06:00 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive
[root@5fa7c4fc72e2 /]# 

-- I don't see my sample001.txt file here.

Then I check the containers:

#  docker container ls -a
CONTAINER ID        IMAGE                  COMMAND             CREATED             STATUS                      PORTS               NAMES
ff2d37e5399a        nfnty/arch-bootstrap   "/bin/bash"         16 seconds ago      Exited (0) 3 seconds ago                        stupefied_sinoussi
bfbff0778fe9        nfnty/arch-bootstrap   "/bin/bash"         7 minutes ago       Exited (0) 30 seconds ago                       objective_noether


Then I check the volumes:

#  docker volume ls
DRIVER              VOLUME NAME
local               47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd
local               fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a

What I found is that there are two volumes, since I have created two containers.

-- Also checking the volume folders:

# cd /var/lib/docker/volumes

#  find . -exec ls -dl \{\} \; | awk '{print $3, $4, $9}'
root root .
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd/_data
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd/_data/archive
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data/archive
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data/sample001.txt
root root ./metadata.db

I was expecting sample001.txt to be visible in any container, i.e. all containers using the same volume folder. But it looks like each container gets a different folder under /var/lib/docker/volumes, even though the mount point is the one defined by VOLUME in the Dockerfile.

I had assumed that VOLUME in a Dockerfile refers to one single folder on the host machine under /var/lib/docker/volumes, irrespective of how many containers we create. But that is not true, since each container has its own host folder under /var/lib/docker/volumes.

Then what is the purpose of VOLUME? One benefit I can see is that if I create some files in the container and store them at the VOLUME path, and I want to access them from the host, I can go and check the volume folders.

But the names of the volume folders make it hard to figure out which container they belong to.
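One way to match volume folders to containers is to ask Docker rather than read the directory names. The following is a minimal sketch, assuming a POSIX shell and the docker CLI on the host; the function names volumes_of and containers_using are made up for illustration.

```shell
#!/bin/sh
# List the volumes attached to a container, one "<volume name> <mount path>"
# pair per line. Reads the .Mounts array reported by `docker inspect`.
volumes_of() {
    docker inspect --format \
        '{{ range .Mounts }}{{ if eq .Type "volume" }}{{ println .Name .Destination }}{{ end }}{{ end }}' \
        "$1"
}

# List every container (running or exited) that references a given volume.
containers_using() {
    docker ps --all --filter "volume=$1" --format '{{ .ID }} {{ .Names }}'
}
```

For example, volumes_of ff2d37e5399a would list the volume mounted at /var/lib/bootstrap for that container, and containers_using followed by one of the long names from docker volume ls would show which container it belongs to.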

Sorry, I am completely new to volumes in Docker. I knew -v [host]:[container], but this is the first time I came across VOLUME in a Dockerfile, so I was completely confused and unable to figure out what was happening.

After reading https://docs.docker.com/storage/volumes/ I found the answer to why VOLUME is used:

In addition, volumes are often a better choice than persisting data in a container’s writable layer, because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.

Also, the link below explains how to create a common volume and use it in different containers (not the same as my question):

https://linuxhint.com/storing-sharing-docker-volumes/
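A named volume is one way to get the behaviour expected in the test above, where a file written in one container shows up in the next. This is a minimal sketch, assuming a POSIX shell; the function name shell_with_volume and the volume name bootstrap-data used below are hypothetical.

```shell
#!/bin/sh
# Start an interactive shell in a container with the named volume $1 mounted
# over the path that the image declares with VOLUME. Docker creates the named
# volume on first use; the caller picks the volume name.
shell_with_volume() {
    docker run --rm -it \
        --volume "$1":/var/lib/bootstrap \
        --entrypoint=/bin/bash \
        nfnty/arch-bootstrap
}
```

Running shell_with_volume bootstrap-data twice should give both shells the same /var/lib/bootstrap, so a sample001.txt created in the first should be visible in the second. The named volume is not removed by --rm, which only cleans up anonymous volumes.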

Santhosh asked Oct 17 '18 19:10


3 Answers

Docker Volumes:

Volumes decouple the life of the data being stored in them from the life of the container that created them. This makes it so you can docker rm my_container and your data will not be removed.

A volume can be created in two ways:

Specifying VOLUME /some/dir in a Dockerfile

Specifying it as part of your run command, as docker run -v /some/dir

Either way, the two do exactly the same thing: they tell Docker to create a directory on the host, within the Docker root path (by default /var/lib/docker), and mount it at the path you've specified (/some/dir above). When you remove the container using this volume, the volume itself continues to live on.

If the path specified does not exist within the container, a directory will be automatically created.

You can tell docker to remove a volume along with the container:

docker rm -v my_container

Sometimes you've already got a directory on your host that you want to use in the container, so the CLI has an extra option for specifying this:

docker run -v /host/path:/some/path ...

This tells docker to use the specified host path specifically, instead of creating one itself within the docker root, and mount that to the specified path within the container (/some/path above).

Note that this can also be a file instead of a directory. This is commonly referred to as a bind mount in Docker terminology (though technically speaking, all volumes are bind mounts in the sense of what is actually happening). If the path on the host does not exist, a directory will automatically be created at the given path.

From the docker documentation:

VOLUME ["/data"]

The VOLUME instruction creates a mount point with the specified name and marks it as holding externally mounted volumes from native host or other containers. The value can be a JSON array, VOLUME ["/var/log/"], or a plain string with multiple arguments, such as VOLUME /var/log or VOLUME /var/log /var/db. For more information/examples and mounting instructions via the Docker client, refer to Share Directories via Volumes documentation.

The docker run command initializes the newly created volume with any data that exists at the specified location within the base image. For example, consider the following Dockerfile snippet:

FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol

This Dockerfile results in an image that causes docker run to create a new mount point at /myvol and copy the greeting file into the newly created volume.
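The copy-on-first-run behaviour described above can be checked with a short script. This is only a sketch, assuming a POSIX shell and a working docker CLI; the function name myvol_demo and the image tag myvol-demo are arbitrary.

```shell
#!/bin/sh
# Build the four-line Dockerfile from the documentation in a scratch
# directory, then read the greeting back from a brand-new container.
myvol_demo() {
    dir=$(mktemp -d) || return 1
    cat > "$dir/Dockerfile" <<'EOF'
FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
EOF
    # The run step only happens if the build succeeded.
    docker build --tag myvol-demo "$dir" &&
        docker run --rm myvol-demo cat /myvol/greeting
    status=$?
    rm -rf "$dir"
    return "$status"
}
```

Calling myvol_demo is expected to print the contents of /myvol/greeting, which shows that the file written at build time was copied into the fresh volume.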

Answer:

So in the above case, the VOLUME ["/var/lib/bootstrap"] instruction persists the data by creating a volume under /var/lib/docker on the host and mounting it at /var/lib/bootstrap in the container.

Notes about specifying volumes

Keep the following things in mind about volumes in the Dockerfile.

Volumes on Windows-based containers: When using Windows-based containers, the destination of a volume inside the container must be one of:

  • a non-existing or empty directory
  • a drive other than C:

Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.

JSON formatting: The list is parsed as a JSON array. You must enclose words with double quotes (") rather than single quotes (').

The host directory is declared at container run-time: The host directory (the mountpoint) is, by its nature, host-dependent. This is to preserve image portability, since a given host directory can’t be guaranteed to be available on all hosts. For this reason, you can’t mount a host directory from within the Dockerfile. The VOLUME instruction does not support specifying a host-dir parameter. You must specify the mountpoint when you create or run the container.

Ijaz Ahmad answered Oct 21 '22 15:10


The VOLUME directive in Dockerfile means:

Every container created from this image will have its own exclusive directory to persist its data. If you do not override this with your own directory or remote mount, dockerd will assign a random directory in your host under /var/lib/docker/volumes.

Even though volume data can be shared between containers, Docker does not assume that it will be. Rather, it assumes that the data inside the volume is meant to be used only by the application instance running in one specific container.

That is usually the case for most applications that persist anything. For example, you cannot have two MySQL instances competing to store data into the same /var/lib/mysql directory. Distributed key value stores, such as etcd, do not need to share persisted data. Each instance will sync/clone when it joins a cluster.

Personally, I find the VOLUME directive a bit intrusive because of these assumptions. It assumes that every time I run a container, I want data to be persisted. For development/testing purposes, that's almost never the case. As a developer, I need to constantly clean up after containers that insist on leaving volumes dangling around.
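For the clean-up chore described above, a few small wrappers may help. This is a sketch, assuming a POSIX shell; the function names are invented for illustration, and which volumes docker volume prune treats as unused can differ between Docker versions, so check its help output first.

```shell
#!/bin/sh
# Print the names of volumes that no container references any more.
dangling_volumes() {
    docker volume ls --quiet --filter dangling=true
}

# Run a throwaway container: --rm deletes the container on exit together
# with the anonymous volumes that were created for it.
run_ephemeral() {
    docker run --rm "$@"
}

# Delete unused volumes without the confirmation prompt.
prune_volumes() {
    docker volume prune --force
}
```

During development, run_ephemeral -it test sh avoids leaving a volume behind in the first place, while dangling_volumes shows what earlier runs left on the host before prune_volumes removes it.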

While the opposite is true for production environments, this default "random directory inside /var/lib/docker/volumes" is almost never a great idea. Cloud solutions that automatically provision volumes will most certainly provision that in some other location. For everything else, you want to control where and how volumes are created. From my experience, in a Kubernetes cluster, /var/lib/docker/volumes is pretty much useless.

JulioHM answered Oct 21 '22 15:10


Regarding your edited question: VOLUME is just a mount point in your container, and any change made inside your container stays with that container. So when you create a new container from the same image, Docker creates a new volume for it, isolated from the previous one. If you want to use the volume of container #1 in a new container, you need to run the new container with the "--volumes-from" option:

docker run -it --volumes-from bfbff0778fe9 --entrypoint=/bin/bash nfnty/arch-bootstrap

I ran a simple test:

root@docker:~/test# cat Dockerfile
FROM alpine
RUN mkdir /test
VOLUME /test
root@docker:~/test# docker build -t test .
root@docker:~/test# docker run -it test sh
/ # cd test
/test # touch hello.txt
/test # exit
root@docker:~/test# docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS                      PORTS               NAMES
d11aa6a4ace1        test                "sh"                44 seconds ago      Exited (0) 31 seconds ago                       compassionate_swanson
root@docker:~/test# docker run -it --volumes-from d11aa6a4ace1 alpine sh
/ # ls /test/
hello.txt

yuanli answered Oct 21 '22 14:10