
Docker ADD vs VOLUME

Tags:

docker

People also ask

What is Docker add?

The ADD instruction copies files and directories into a Docker image. It can copy data in three ways: copy files from local storage to a destination in the image, copy a tarball from local storage and extract it automatically at a destination in the image, or download a file from a URL to a destination in the image.

What is difference between ADD and copy in Docker?

COPY is a Dockerfile instruction that copies files from a local source location to a destination in the Docker image; that is its only function. ADD also copies files and directories into an image, but it can additionally copy files from a URL and automatically extract a local tarball.

What is a volume Docker?

Docker volumes are file systems mounted on Docker containers to preserve data generated by the running container. The volumes are stored on the host, independent of the container life cycle. This allows users to back up data and share file systems between containers easily.


ADD

The fundamental difference between the two is that ADD makes whatever you're adding, be it a folder or a single file, part of your image. Anyone who uses the image you've built will have access to whatever you ADD. This is true even if you remove it in a later instruction, because Docker works in layers and the ADD layer still exists as part of the image. To be clear, ADD happens only at build time; you can never ADD at run time.
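As a minimal sketch of the layering point above, assume a hypothetical file secret.txt in the build context and an arbitrary base image (both are illustrative and not part of the original answer):

```dockerfile
# Base image chosen only for illustration.
FROM alpine:3.19

# This instruction creates a layer that contains secret.txt.
ADD ./secret.txt /secret.txt

# This creates a NEW layer that hides the file.
# The ADD layer above is left unchanged.
RUN rm /secret.txt
```

A container started from the resulting image will not see /secret.txt, but the file is still stored in the ADD layer, so anyone who can obtain the image can recover it, for example by unpacking the archive produced by docker save.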

A few examples of cases where you'd want to use ADD:

  • You have some requirements in a requirements.txt file that you want to reference and install in your Dockerfile. You can then do: ADD ./requirements.txt /requirements.txt followed by RUN pip install -r /requirements.txt
  • You want your app code to be part of the image, for example so that you can set your app directory as the working directory and have the default command of a container run from your image start your app. You can do:

    ADD ./ /usr/local/git/my_app
    WORKDIR /usr/local/git/my_app
    CMD python ./main.py
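Putting the two ADD examples above together, a complete Dockerfile might look roughly like the sketch below. The base image is an assumption (any image that provides python and pip would do), and the paths are the ones used in the examples:

```dockerfile
# Base image is an assumption; it only needs to provide python and pip.
FROM python:3.12-slim

# Add only the requirements file first, so this layer and the pip install
# layer can be reused from the build cache when just the app code changes.
ADD ./requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

# Add the rest of the app and make its directory the working directory.
ADD ./ /usr/local/git/my_app
WORKDIR /usr/local/git/my_app

# Default command for containers run from this image.
CMD ["python", "./main.py"]
```

With a hypothetical tag my_app, this could be built with docker build -t my_app . and started with docker run my_app. For plain local files like these, COPY behaves the same way as ADD and is what the Docker documentation generally recommends.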

VOLUME

VOLUME, on the other hand, just gives a container run from your image access to some path on whatever machine the container is running on. You cannot use files from your VOLUME directory in your Dockerfile: anything in the volume directory is inaccessible at build time but accessible at run time.

A few examples of cases where you'd want to use VOLUME:

  • The app being run in your container makes logs in /var/log/my_app. You want those logs to be accessible on the host machine and not to be deleted when the container is removed. You can do this by creating a mount point at /var/log/my_app by adding VOLUME /var/log/my_app to your Dockerfile and then running your container with docker run -v /host/log/dir/my_app:/var/log/my_app some_repo/some_image:some_tag
  • You have some local settings files you want the app in the container to have access to. Perhaps those settings files differ between your local machine, dev, and production. This matters even more if the settings files are secret, in which case you definitely do not want them in your image. A good strategy in that case is to add VOLUME /etc/settings/my_app_settings to your Dockerfile, run your container with docker run -v /host/settings/dir:/etc/settings/my_app_settings some_repo/some_image:some_tag, and make sure /host/settings/dir exists in every environment where you expect your app to run.
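The two VOLUME cases above could be combined in one Dockerfile, sketched here under the assumption of a Python base image and the app layout from the ADD examples:

```dockerfile
# Base image is an assumption, chosen only for illustration.
FROM python:3.12-slim

ADD ./ /usr/local/git/my_app
WORKDIR /usr/local/git/my_app

# Declare the mount points. Host directories are attached to them
# with docker run -v when the container is started.
VOLUME /var/log/my_app
VOLUME /etc/settings/my_app_settings

CMD ["python", "./main.py"]
```

A container from this image would then be started with both mounts, for example docker run -v /host/log/dir/my_app:/var/log/my_app -v /host/settings/dir:/etc/settings/my_app_settings some_repo/some_image:some_tag. If a -v flag is left out, Docker creates an anonymous volume for that declared path instead of using a host directory.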

The VOLUME instruction creates a data volume in your Docker container at runtime. The directory provided as an argument to VOLUME is a directory that bypasses the Union File System, and is primarily used for persistent and shared data.

If you run docker inspect <your-container>, you will see under the Mounts section there is a Source which represents the directory location on the host, and a Destination which represents the mounted directory location in the container. For example,

"Mounts": [
  {
    "Name": "fac362...80535",
    "Source": "/var/lib/docker/volumes/fac362...80535/_data",
    "Destination": "/webapp",
    "Driver": "local",
    "Mode": "",
    "RW": true,
    "Propagation": ""
  }
]

Here are 3 use cases for docker run -v:

  1. docker run -v /data: This is analogous to specifying the VOLUME instruction in your Dockerfile.
  2. docker run -v $host_path:$container_path: This allows you to mount $host_path from your host to $container_path in your container during runtime. In development, this is useful for sharing source code on your host with the container. In production, this can be used to mount things like the host's DNS information (found in /etc/resolv.conf) or secrets into the container. Conversely, you can also use this technique to write the container's logs into specific folders on the host. Both $host_path and $container_path must be absolute paths.
  3. docker run -v my_volume:$container_path: This creates a data volume in your container at $container_path and names it my_volume. It is essentially the same as creating and naming a volume using docker volume create my_volume. Naming a volume like this is useful for a container data volume and a shared-storage volume using a multi-host storage driver like Flocker.

Notice that mounting a host folder as a data volume is not possible from a Dockerfile. To quote the Docker documentation:

Note: This is not available from a Dockerfile due to the portability and sharing purpose of it. As the host directory is, by its nature, host-dependent, a host directory specified in a Dockerfile probably wouldn't work on all hosts.

Now, if you want your files to be available in containers in non-development environments, you can copy them into the image with the ADD or COPY instructions in your Dockerfile. These are what I usually use for non-development deployment.