I am trying to understand the Dockerfile below:
https://github.com/nfnty/dockerfiles/blob/master/images/arch-bootstrap/latest/Dockerfile
FROM nfnty/arch-mini:latest
.....
RUN install --directory --owner=root --group=root --mode=700 /var/lib/bootstrap/{,archive}
USER root
VOLUME ["/var/lib/bootstrap"]
ENTRYPOINT ["/opt/bootstrap/build.sh"]
RUN creates the directory /var/lib/bootstrap/archive, so after the build the image contains this folder permanently.
When a container is created from the image, it will have the folder "/var/lib/bootstrap/archive" because it exists in the image.
What is the point of declaring VOLUME /var/lib/bootstrap/?
I understand that on the command line, -v [host path]:[container path] mounts the host folder on the container's folder,
but what does VOLUME in a Dockerfile do, especially in the above case?
OK, here is a test I have done:
-- Creating a container from the image built with the Dockerfile above,
i.e. with VOLUME ["/var/lib/bootstrap"]
hostsystem# docker run -it --entrypoint=/bin/bash nfnty/arch-bootstrap
[root@684120b46cfb /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 05:53 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive
-- I created a sample001.txt file inside it.
[root@684120b46cfb /]# touch /var/lib/bootstrap/sample001.txt
[root@684120b46cfb /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 05:54 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive
-rw-r--r-- 1 root root 0 Oct 18 05:54 sample001.txt
[root@684120b46cfb /]#
[root@684120b46cfb /]# exit
-- As per [@izazkhan's answer][1], the VOLUME ["/var/lib/bootstrap"]
instruction persists the data by creating a volume in
/var/lib/docker on the host and mounting it on /var/lib/bootstrap
in the container. So I expect sample001.txt to be at
/var/lib/docker/(var/lib/bootstrap)
-- Now creating a second container from the same image:
hostsystem# docker run -it --entrypoint=/bin/bash nfnty/arch-bootstrap
[root@5fa7c4fc72e2 /]# ls -al /var/lib/bootstrap/
total 12
drwx------ 3 root root 4096 Oct 18 06:00 .
drwxr-xr-x 1 root root 4096 Aug 23 12:48 ..
drwx------ 2 root root 4096 Aug 23 12:48 archive
[root@5fa7c4fc72e2 /]#
-- I don't see my sample001.txt file here.
Then I checked the containers:
# docker container ls -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
ff2d37e5399a nfnty/arch-bootstrap "/bin/bash" 16 seconds ago Exited (0) 3 seconds ago stupefied_sinoussi
bfbff0778fe9 nfnty/arch-bootstrap "/bin/bash" 7 minutes ago Exited (0) 30 seconds ago objective_noether
And I checked the volumes:
# docker volume ls
DRIVER VOLUME NAME
local 47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd
local fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a
I found that there are two volumes, since I have created two containers.
-- Also checking the volume folders:
# cd /var/lib/docker/volumes
# find . -exec ls -dl \{\} \; | awk '{print $3, $4, $9}'
root root .
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd/_data
root root ./47ae26f1b4b17cd2792972b50dcae9da9af1d3f06ccd984cfbf5a75be7365bbd/_data/archive
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data/archive
root root ./fd5a1caf07024f7103a3a225f4de00a2c1efb79a74fa939737f11c939837b32a/_data/sample001.txt
root root ./metadata.db
I was expecting sample001.txt to be visible in every container, i.e. all containers would use the same volume folder. But it looks like each container gets a different folder under /var/lib/docker/volumes, even though the mount point is the one defined by VOLUME in the Dockerfile.
I had assumed that VOLUME in a Dockerfile refers to one single folder on the host machine under /var/lib/docker/volumes, irrespective of how many containers we create. But that is not true: each container has its own host folder under /var/lib/docker/volumes.
Then what is the purpose of VOLUME? One benefit I can see is that if I create some files in the container and store them at the VOLUME path, I can access them from the host by looking in the volume folders.
But from the names of the volume folders it is not easy to figure out which container they belong to.
Sorry, I am completely new to volumes in Docker. I knew -v [host]:[container], but this is the first time I have come across VOLUME in a Dockerfile, so I was confused and unable to figure out what is happening.
After reading https://docs.docker.com/storage/volumes/ I found the answer to why VOLUME is used:
In addition, volumes are often a better choice than persisting data in a container’s writable layer, because a volume does not increase the size of the containers using it, and the volume’s contents exist outside the lifecycle of a given container.
The link below also shows how to create a common volume and use it in different containers (not the same as my question):
https://linuxhint.com/storing-sharing-docker-volumes/
Docker Volumes:
Volumes decouple the life of the data being stored in them from the life of the container that created them. This makes it so you can docker rm my_container
and your data will not be removed.
A volume can be created in two ways:
Specifying VOLUME /some/dir
in a Dockerfile
Specifying it as part of your run command, as docker run -v /some/dir
Either way, the effect is the same: Docker creates a directory on the host, within the docker root path (by default /var/lib/docker
), and mounts it at the path you've specified (/some/dir
above). When you remove the container using this volume, the volume itself continues to live on.
If the path specified does not exist within the container, a directory will be automatically created.
You can tell docker to remove a volume along with the container:
docker rm -v my_container
Sometimes you've already got a directory on your host that you want to use in the container, so the CLI has an extra option for specifying this:
docker run -v /host/path:/some/path ...
This tells docker to use the specified host path specifically, instead of creating one itself within the docker root, and mount that to the specified path within the container (/some/path
above).
Note that this can also be a file instead of a directory. This is commonly referred to as a bind-mount in Docker terminology (though technically speaking, all volumes are bind-mounts in the sense of what is actually happening). If the path on the host does not exist, a directory will automatically be created at the given path.
From the docker documentation:
VOLUME ["/data"]
The VOLUME
instruction creates a mount point with the specified name and marks it as holding externally mounted volumes from native host or other containers. The value can be a JSON array, VOLUME ["/var/log/"]
, or a plain string with multiple arguments, such as VOLUME /var/log
or VOLUME /var/log /var/db
. For more information/examples and mounting instructions via the Docker client, refer to Share Directories via Volumes documentation.
The docker run command initializes the newly created volume with any data that exists at the specified location within the base image. For example, consider the following Dockerfile snippet:
FROM ubuntu
RUN mkdir /myvol
RUN echo "hello world" > /myvol/greeting
VOLUME /myvol
This Dockerfile results in an image that causes docker run to create a new mount point at /myvol
and copy the greeting file into the newly created volume.
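One way to check this behaviour, as a minimal sketch: save the snippet above as a Dockerfile in some directory, build it, and print the greeting from a fresh container. The function name show_greeting and the image tag myvol-demo are made up for illustration and are not part of the Docker documentation.

```shell
# show_greeting DIR
# Build the Dockerfile found in DIR, then print /myvol/greeting from a new
# container. /myvol is a volume mount point in that container, so the file is
# read from the newly initialised volume.
# "myvol-demo" is a hypothetical image tag.
show_greeting() {
    docker build --tag myvol-demo "$1" &&
        docker run --rm myvol-demo cat /myvol/greeting
}
```

Calling show_greeting . in the directory that holds the Dockerfile should print the greeting. The --rm flag removes the container, and the anonymous volume created for it, once the command exits.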
Answer:
So in the above case, the VOLUME ["/var/lib/bootstrap"]
instruction persists the data by creating a volume under /var/lib/docker
on the host and mounting it on /var/lib/bootstrap
in the container.
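To see which host directory actually backs the volume of a particular container, docker inspect can print that container's mounts. A minimal sketch, assuming the container ID or name is passed as the first argument; the function name container_mounts is hypothetical.

```shell
# container_mounts CONTAINER
# Print one line per mount: volume name, host directory, path in the container.
# .Mounts, .Name, .Source and .Destination are fields of `docker inspect` output.
container_mounts() {
    docker inspect \
        --format '{{ range .Mounts }}{{ .Name }} {{ .Source }} {{ .Destination }}{{ println }}{{ end }}' \
        "$1"
}
```

For example, container_mounts bfbff0778fe9 would list the volume mounted at /var/lib/bootstrap together with its _data directory on the host, which makes it easier to match the long volume names to a container.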
Notes about specifying volumes
Keep the following things in mind about volumes in the Dockerfile.
Volumes on Windows-based containers: When using Windows-based containers, the destination of a volume inside the container must be one of:
Changing the volume from within the Dockerfile: If any build steps change the data within the volume after it has been declared, those changes will be discarded.
JSON formatting: The list is parsed as a JSON array. You must enclose words with double quotes (") rather than single quotes (').
The host directory is declared at container run-time: The host directory (the mountpoint) is, by its nature, host-dependent. This is to preserve image portability, since a given host directory can’t be guaranteed to be available on all hosts. For this reason, you can’t mount a host directory from within the Dockerfile. The VOLUME
instruction does not support specifying a host-dir parameter. You must specify the mountpoint when you create or run the container.
The VOLUME
directive in Dockerfile means:
Every container created from this image will have its own exclusive directory to persist its data. If you do not override this with your own directory or remote mount, dockerd will assign a random directory in your host under
/var/lib/docker/volumes
.
Even though volumes can be shared, Docker does not assume that volume data will be shared between containers. Rather, it assumes that the data inside the volume is meant to be used only by the application instance running in one specific container.
That is usually the case for most applications that persist anything. For example, you cannot have two MySQL instances competing to store data into the same /var/lib/mysql
directory. Distributed key value stores, such as etcd
, do not need to share persisted data. Each instance will sync/clone when it joins a cluster.
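If sharing is what you want, one option is to mount a named volume over the path declared by VOLUME, so that every container started this way sees the same data. A sketch under assumptions: the volume name bootstrap-data and the function name bootstrap_shell are made up; the image and entrypoint are the ones from the question.

```shell
# Hypothetical volume name. Docker creates the named volume on first use.
VOLUME_NAME=bootstrap-data

# Open a shell in a new container with the named volume mounted at the path
# that the image declares with VOLUME, instead of a per-container anonymous
# volume. --rm removes the container on exit but leaves the named volume.
bootstrap_shell() {
    docker run -it --rm \
        --volume "${VOLUME_NAME}:/var/lib/bootstrap" \
        --entrypoint=/bin/bash \
        nfnty/arch-bootstrap
}
```

A file created under /var/lib/bootstrap in one such shell should then still be there the next time bootstrap_shell is run.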
I, myself, find the VOLUME
directive a bit intrusive because of these assumptions. It assumes that every time I run a container, I want data to be persisted. For development/testing purposes, that's almost never the case. As a developer, I need to constantly clean up after containers that insist on leaving volumes dangling around.
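For that kind of clean-up, the Docker CLI has some built-in help. A minimal sketch; the function names are hypothetical, while the docker flags used are existing CLI options.

```shell
# List the names of volumes that are no longer referenced by any container.
dangling_volumes() {
    docker volume ls --quiet --filter dangling=true
}

# Run a throwaway container: --rm removes the container when it exits,
# together with the anonymous volumes created for it.
run_disposable() {
    docker run --rm "$@"
}
```

For example, run_disposable -it test sh starts a shell that leaves no volume behind. docker volume prune can delete unused volumes in bulk; check docker volume prune --help for what your Docker version includes.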
While the opposite is true for production environments, this default "random directory inside /var/lib/docker/volumes
" is almost never a great idea. Cloud solutions that automatically provision volumes will most certainly provision that in some other location. For everything else, you want to control where and how volumes are created. From my experience, in a Kubernetes cluster, /var/lib/docker/volumes
is pretty much useless.
Regarding your edited question: VOLUME is just a mount point in your container, and any change made inside a container stays with that container's volume. So when you create a new container from the same image, Docker creates a new volume for it, isolated from the previous one. If you want to use the volume of container #1 in a new container, you need to run the new container with the "--volumes-from" option:
docker run -it --volumes-from bfbff0778fe9 --entrypoint=/bin/bash nfnty/arch-bootstrap
I have done a simple test:
root@docker:~/test# cat Dockerfile
FROM alpine
RUN mkdir /test
VOLUME /test
root@docker:~/test# docker build -t test .
root@docker:~/test# docker run -it test sh
/ # cd test
/test # touch hello.txt
/test # exit
root@docker:~/test# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
d11aa6a4ace1 test "sh" 44 seconds ago Exited (0) 31 seconds ago compassionate_swanson
root@docker:~/test# docker run -it --volumes-from d11aa6a4ace1 alpine sh
/ # ls /test/
hello.txt