I need the contents of a large *.zip file (5 GB) in my Docker container in order to compile a program. The *.zip file resides on my local machine. The strategy for this would be:
COPY program.zip /tmp/
RUN cd /tmp \
&& unzip program.zip \
&& make
After doing this, I would like to remove the unzipped directory and the original *.zip file because they are no longer needed. The problem is that the COPY directive (and also the ADD directive) adds a layer to the image that contains the file program.zip, which is problematic because my image will then be at least 5 GB in size. Is there a way to add a file to a container without using the COPY or ADD directive? wget will not work because the *.zip file is on my local machine, and curl file://localhost/home/user/program.zip -o /tmp/program.zip will not work either.
It is not straightforward, but it can be done via wget or curl with a little support from python. (All three tools should usually be available on a *nix system.)
wget will not work when no URL is given, and curl file://localhost/home/user/program.zip -o /tmp/ will not work from within a Dockerfile's RUN instruction. Hence, we need a server that wget and curl can access and download program.zip from.
To do this we set up a small Python server that serves our HTTP requests, using the http.server module. (http.server is part of Python 3. On Python 2, the equivalent module is SimpleHTTPServer, started with python -m SimpleHTTPServer 8000.)
python3 -m http.server --bind 192.168.178.20 8000
The Python server serves all files in the directory it is started in. So make sure to start the server either in the directory that contains the file you want to download during the image build, or in a temporary directory into which you have copied your program. For illustration purposes, let's create the file foo.txt, which we will later download via wget in our Dockerfile:
echo "foo bar" > foo.txt
When starting the HTTP server, it is important to specify the IP address of our local machine on the LAN, so that the container running the build can reach it. We also serve on port 8000. Having done this, we should see the following output:
python3 -m http.server --bind 192.168.178.20 8000
Serving HTTP on 192.168.178.20 port 8000 ...
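The address 192.168.178.20 is specific to the network used in this answer. As a rough sketch of how you might find the corresponding address on your own machine, the helper below asks the operating system which local address it would use for outbound traffic. The function name guess_lan_ip and the probe address are assumptions made for illustration, and on a machine with several network interfaces the result may not be the one you want.

```python
import socket


def guess_lan_ip(probe_host="192.0.2.1", probe_port=80):
    """Return the local IPv4 address the OS would use for outbound traffic.

    Connecting a UDP socket sends no packets; it only makes the kernel
    choose a route, so the probe address (a documentation-only address,
    chosen here as an assumption) does not need to be reachable.
    Falls back to 127.0.0.1 when no route is available.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        return "127.0.0.1"
    finally:
        sock.close()


# Example: print(guess_lan_ip()) and pass the result to --bind.
```

If the helper returns 127.0.0.1, the machine has no usable route and the build container will most likely not be able to reach the server at that address.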
Now we build a Dockerfile to illustrate how this works. (We will assume that the file foo.txt should be downloaded into /tmp):
FROM debian:latest
RUN apt-get update -qq \
&& apt-get install -y wget
RUN cd /tmp \
&& wget http://192.168.178.20:8000/foo.txt
Now we start the build with
docker build -t test .
During the build you will see the following output on the Python server:
172.17.0.21 - - [01/Nov/2014 23:32:37] "GET /foo.txt HTTP/1.1" 200 -
and the build output of our image will be:
Step 2 : RUN cd /tmp && wget http://192.168.178.20:8000/foo.txt
---> Running in 49c10e0057d5
--2014-11-01 22:56:15-- http://192.168.178.20:8000/foo.txt
Connecting to 192.168.178.20:8000... connected.
HTTP request sent, awaiting response... 200 OK
Length: 25872 (25K) [text/plain]
Saving to: `foo.txt'
0K .......... .......... ..... 100% 129M=0s
2014-11-01 22:56:15 (129 MB/s) - `foo.txt' saved [25872/25872]
---> 5228517c8641
Removing intermediate container 49c10e0057d5
Successfully built 5228517c8641
You can then check whether it really worked by starting and entering a container from the image you just built:
docker run -i -t --rm test bash
You can then look in /tmp for foo.txt.
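Starting the server by hand and stopping it after the build can also be scripted. The following is only a sketch of one way to do that with the same http.server module: it serves a directory in a background thread, runs a command, and shuts the server down afterwards. The names serve_directory and build_with_server are hypothetical helpers, not part of Docker or Python, and the code assumes Python 3.7 or newer.

```python
import contextlib
import functools
import http.server
import subprocess
import threading


@contextlib.contextmanager
def serve_directory(directory, bind="127.0.0.1", port=0):
    """Serve `directory` over HTTP for the duration of the with-block.

    Yields the (host, port) the server is actually listening on;
    port=0 lets the OS pick a free port.
    """
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer((bind, port), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        yield server.server_address
    finally:
        server.shutdown()      # stop serve_forever()
        server.server_close()  # release the listening socket
        thread.join()


def build_with_server(directory, bind, port, command):
    """Run `command` while `directory` is being served; return its exit code."""
    with serve_directory(directory, bind, port):
        return subprocess.run(command, check=False).returncode


# Example with the values from the walkthrough (adjust to your network):
# build_with_server(".", "192.168.178.20", 8000,
#                   ["docker", "build", "-t", "test", "."])
```

The server only exists while the command runs, so nothing keeps listening on the LAN after the build has finished.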
We can now add any file to our image without the file ending up in a layer of its own. Assuming you want to add a program of about 5 GB, as mentioned in the question, we could do:
FROM debian:latest
RUN apt-get update -qq \
&& apt-get install -y wget unzip make
RUN cd /tmp \
&& wget http://192.168.178.20:8000/program.zip \
&& unzip program.zip \
&& cd program \
&& make \
&& make install \
&& cd /tmp \
&& rm -f program.zip \
&& rm -rf program
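Because the download, build, and cleanup all happen in a single RUN instruction, the resulting layer contains neither program.zip nor the unpacked sources, so we are not left with 10 GB of cruft.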