How can I resume pull when disconnected?
The pull process always starts from the beginning every time I run docker pull some-image again after a disconnect. My connection is so unstable that even downloading just a 100MB image takes very long and almost always fails, so it is nearly impossible for me to pull a bigger image. How can I resume the pull process?
Update:
The pull
process will now automatically resume based on which layers have already been downloaded. This was implemented with https://github.com/moby/moby/pull/18353.
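In practice this means that after a dropped connection you can simply re-run the same pull; layers that already finished downloading are not fetched again (the image name below is just an example):

docker pull ubuntu:20.04
# ...connection drops partway through...
docker pull ubuntu:20.04    # run it again: already-downloaded layers are kept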
Old:
There is no resume
feature yet. However, there are discussions about implementing this feature in Docker's download manager.
Docker's released code isn't as up to date as the moby development repository on GitHub. People have been reporting issues related to this for several years. I tried manually applying several patches that aren't upstream yet, and none of them worked decently.
The GitHub repository for moby (Docker's development repo) has a script called download-frozen-image-v2.sh. The script uses bash, curl, and a command-line JSON parser (jq). It retrieves a Docker registry token and then downloads all of the layers to a local directory, which you can then import into your local Docker installation with docker load.
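For reference, a typical invocation looks roughly like this (the directory and image name are just placeholders):

# download all layers of the image into ./ubuntu-dir
./download-frozen-image-v2.sh ./ubuntu-dir ubuntu:latest
# pack the directory into a tar stream and load it into the local Docker daemon
tar -cC ./ubuntu-dir . | docker load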
It does not handle resuming well, though. There was a comment in the script noting that 'curl -C' wasn't working. I tracked down and fixed this problem. My modification first saves the response headers to a ".headers" file (this initial request has always returned a 302 redirect while I've been monitoring it), and then retrieves the final URL with curl (with resume support) into the layer tar file. It also has to loop in the calling function to fetch a fresh valid token, which unfortunately only lasts about 30 minutes.
It loops through this process until it receives a 416, which indicates that no resume is possible because the requested range has already been fulfilled. It also verifies the file size against the content-length returned by a curl header request. I have been able to retrieve all the images I needed using this modified script. Docker itself has many more layers of code involved in retrieval, plus remote-control processes (the Docker client), which make it harder to control, and the maintainers viewed this issue as only affecting some people on bad connections.
I hope this script can help you as much as it has helped me:
Changes: the fetch_blob function uses a temporary file for its first connection and extracts the 30x HTTP redirect from it. It then does a header retrieval on the final URL and checks whether the local copy already contains the full file; otherwise, it starts a resumed curl operation. The calling function, which passes in a valid token, wraps the token retrieval and fetch_blob in a loop to ensure the full file is obtained.
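The core of the idea looks roughly like this; the variable names ($token, $image, $digest, $targetFile) and the ".headers" file are illustrative, not the script's exact code:

# 1) fetch only the headers of the blob request (the registry answers with a 302 redirect)
curl -fsS -o /dev/null -D "$targetFile.headers" \
     -H "Authorization: Bearer $token" \
     "https://registry-1.docker.io/v2/$image/blobs/$digest"
# 2) extract the redirect target from the saved headers
blobRedirect="$(grep -i '^location:' "$targetFile.headers" | tr -d '\r' | cut -d' ' -f2-)"
# 3) download (or resume with -C -) the layer tar from the redirect target
curl -fL -C - "$blobRedirect" -o "$targetFile"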
The only other change is a bandwidth-limit variable which can be set at the top of the script, or via a "BW:10" command-line parameter. I needed this to keep my connection usable for other things; a sketch of the idea follows.
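A rough sketch of how such a cap could be wired into the curl calls (the "BW:10" parsing and the way curlArgs is populated here are assumptions about the script, not its actual code):

BW=0   # bandwidth cap in KB/s, 0 = unlimited; assumed to be overridable via a "BW:10" argument
for arg in "$@"; do
    case "$arg" in
        BW:*) BW="${arg#BW:}" ;;
    esac
done
curlArgs=()
if [ "$BW" -gt 0 ]; then
    curlArgs+=(--limit-rate "${BW}k")   # curl's --limit-rate accepts values like 10k or 1m
fi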
Click here for the modified script.
In the future it would be nice if Docker's own client performed resuming properly. Increasing the token's validity period would also help tremendously.
Brief view of the changed code:
# loop until FULL_FILE is set in fetch_blob; this handles bad/slow connections
while [ "$FULL_FILE" != "1" ]; do
    local token="$(curl -fsSL "$authBase/token?service=$authService&scope=repository:$image:pull" | jq --raw-output '.token')"
    fetch_blob "$token" "$image" "$layerDigest" "$dir/$layerTar" --progress
    sleep 1
done
Another section from fetch_blob:
while :; do
    # if the file already exists, we will be resuming
    if [ -f "$targetFile" ]; then
        # get the current size of the file we are resuming
        CUR=`stat --printf="%s" "$targetFile"`
        # use curl to fetch the headers and find the content-length of the full file
        LEN=`curl -I -fL "${curlArgs[@]}" "$blobRedirect" | grep content-length | cut -d" " -f2`
        # if we already have the entire file, stop curl from erroring with a 416
        if [ "$CUR" == "${LEN//[!0-9]/}" ]; then
            FULL_FILE=1
            break
        fi
    fi
    HTTP_CODE=`curl -w %{http_code} -C - --tr-encoding --compressed --progress-bar -fL "${curlArgs[@]}" "$blobRedirect" -o "$targetFile"`
    if [ "$HTTP_CODE" == "403" ]; then
        # token expired, so the server stopped allowing us to resume; return without setting
        # FULL_FILE and the caller will restart this function with a fresh token
        FULL_FILE=0
        break
    fi
    if [ "$HTTP_CODE" == "416" ]; then
        # 416 means the requested range is already satisfied, i.e. the file is complete
        FULL_FILE=1
        break
    fi
    sleep 1
done