
Multi-stage Docker: RUN wget vs ADD

Tags:

docker

The best practices section of the Docker docs says this:

Because image size matters, using ADD to fetch packages from remote URLs is strongly discouraged; you should use curl or wget instead. That way you can delete the files you no longer need after they’ve been extracted and you won’t have to add another layer in your image. For example, you should avoid doing things like:

ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all

And instead, do something like:

RUN mkdir -p /usr/src/things \
    && curl -SL http://example.com/big.tar.xz \
    | tar -xJC /usr/src/things \
    && make -C /usr/src/things all

On the other hand, later on it notes:

Prior to Docker 17.05, and even more, prior to Docker 1.10, it was important to minimize the number of layers in your image. [...] Docker 17.05 and higher add support for multi-stage builds, which allow you to copy only the artifacts you need into the final image.

and even

compress[ing] two RUN commands together using the Bash && operator [is] failure-prone and hard to maintain.

It seems to me that if you are using multi-stage builds, the advice about ADD is inaccurate. The extra layers are unlikely to be a problem unless you are downloading something truly huge, as local disk space is cheap and it's easy to clean out old images. Indeed, when coding one doesn't usually have build commands clean their intermediate artefacts to save space!

In addition, ADD has a major advantage over RUN wget: it detects when its target has changed.

Am I missing something, or do multi-stage builds rehabilitate ADD?

Mohan asked Dec 09 '17 06:12



1 Answer

It does look like it: ADD is used, for instance, in the Katacoda "Creating optimized Docker Images using Multi-Stage Builds" example (for the first image, before the second stage).

With two stages, you only need to focus on minimizing the number of layers (42 at most for aufs, with a hard limit of 127) in the second stage.
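
As a rough sketch of what that could look like, here is the question's ADD example moved into a first stage, with only the build result copied into the final image. The base images, the installed packages, and the artifact path `/usr/src/things/thing` are assumptions for illustration; they are not taken from the Docker docs or the Katacoda scenario.

```dockerfile
# Stage 1: build stage. Layers created here do not end up in the final image.
FROM debian:bookworm AS build

# Assumption: the project needs a C toolchain, plus xz to unpack the archive.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential xz-utils \
    && rm -rf /var/lib/apt/lists/*

# ADD downloads the remote file. Archives fetched from a URL are not
# auto-extracted, so the explicit tar step is still needed.
ADD http://example.com/big.tar.xz /usr/src/things/
RUN tar -xJf /usr/src/things/big.tar.xz -C /usr/src/things
RUN make -C /usr/src/things all

# Stage 2: final image. Only what is copied explicitly lands here.
FROM debian:bookworm-slim

# Hypothetical artifact path: adjust to whatever `make all` actually produces.
COPY --from=build /usr/src/things/thing /usr/local/bin/thing
CMD ["thing"]
```

The downloaded tarball and the extracted sources stay behind in the `build` stage, so the separate ADD and RUN layers there cost local disk space but do not grow the image you ship.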

VonC answered Sep 18 '22 19:09