I have an online file that I can download via my web browser.
I run curl on it with:
mkdir -p ./data
curl -L -C - 'http://www.ngdc.noaa.gov/mgg/global/relief/ETOPO1/data/ice_surface/grid_registered/netcdf/readme_etopo1_netcdf.txt' -o ./data/countries.zip
I obtain the following error message:
curl: (33) HTTP server doesn't seem to support byte ranges. Cannot resume.
How can I fix that? Other download tools are welcome.
Note:
-L : follows redirects
-C - : continues a previously unfinished download
Edit: this error message appears when the file to download already exists AND is already complete. It also stops the ongoing script. My requirements are:
How could I do so?
I tried running this command twice:
curl -L -C - 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_sovereignty.zip' -o countries.zip
and got the following output:
$ curl -L -C - 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_sovereignty.zip' -o countries.zip
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 5225k 100 5225k 0 0 720k 0 0:00:07 0:00:07 --:--:-- 836k
$ curl -L -C - 'http://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_0_sovereignty.zip' -o countries.zip
** Resuming transfer from byte position 5351381
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
$ echo $?
0
So, it looks like the "resume" is working fine. Since you posted the question back in May, it is entirely possible that cURL has fixed their bug or that the webserver in question has updated their support for HTTP range requests.
As you noted in the comments, the bug still occurs with the ngdc.noaa.gov website. I checked my curl and it is doing the same thing. Therefore the bug is still in curl.
With Wireshark I checked what is going on in the HTTP protocol. Basically, when curl makes the request to resume the completed file, the server sends back an HTTP 416 error ("Requested Range Not Satisfiable"). In the case of naturalearthdata.com, the CDN they use adds a Content-Range header specifying the exact length of the file. ngdc.noaa.gov does not add this header. Note that the addition of Content-Range in HTTP 416 responses is optional per RFC 2616.
curl uses Content-Range to determine if the download is complete. If the header is missing, curl assumes that the server doesn't support range downloads and spits out that error message.
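To see this difference without Wireshark, something like the following sketch may help. It asks the server for a range that starts at the size of the local file and prints only the status lines and any Content-Range header. The function names probe_range and filter_range_headers are made up for this illustration; the curl options used (-r, -D, -o, -s, -L) are standard ones.

```shell
#!/bin/sh
# Keep only status lines and Content-Range headers from a header dump on stdin.
# HTTP headers end in CRLF, so the carriage returns are stripped first.
filter_range_headers() {
    tr -d '\r' | grep -i -E '^(HTTP/|Content-Range:)'
}

# Request the byte range that starts at the current size of FILE.
# Usage: probe_range URL FILE   (hypothetical helper, not part of curl)
probe_range() {
    url=$1
    file=$2
    size=$(wc -c < "$file" | tr -d ' ')
    # -r sets the Range header, -D - writes response headers to stdout,
    # -o /dev/null discards the body.
    curl -s -L -r "${size}-" -D - -o /dev/null "$url" | filter_range_headers
}
```

Running probe_range against a complete local copy should show a 416 status for both sites, with a Content-Range line only for the server that sends one.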
I've reported this as a bug to the libcurl mailing list. We'll see what they say. In the meantime, here are two possible workarounds:
Use aria2c, which is a very nice command-line download utility with support for multiple connections and resumed downloads. It may make your downloads faster by utilizing more of your connection (assuming the server supports it), and I've checked that aria2c doesn't suffer from the same bug as curl.
Use curl -I <URL> | grep Content-Length | cut -d' ' -f 2 to obtain the length of the file, and check that against your downloaded file size, before running curl.
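The size check could be wrapped in a small script along these lines. This is only a sketch: the function names are invented here, and it assumes the server reports a Content-Length on a HEAD request. It strips the trailing carriage return that HTTP headers carry (which would otherwise break a numeric comparison) and takes the last Content-Length seen, since with -L curl prints one header block per redirect hop.

```shell
#!/bin/sh
# Print the last Content-Length value found in HTTP headers on stdin.
parse_content_length() {
    tr -d '\r' | awk 'tolower($1) == "content-length:" { len = $2 }
                      END { if (len != "") print len }'
}

# Size of a local file in bytes; 0 if it does not exist.
local_size() {
    if [ -f "$1" ]; then wc -c < "$1" | tr -d ' '; else echo 0; fi
}

# Download URL to FILE: skip if the sizes match, otherwise resume with -C -.
# (hypothetical helper; assumes the server answers HEAD with Content-Length)
fetch_if_incomplete() {
    url=$1
    file=$2
    remote=$(curl -s -I -L "$url" | parse_content_length)
    have=$(local_size "$file")
    if [ -n "$remote" ] && [ "$remote" -eq "$have" ]; then
        echo "$file is already complete ($have bytes); skipping"
        return 0
    fi
    curl -L -C - "$url" -o "$file"
}
```

It would be called as fetch_if_incomplete '<URL>' ./data/countries.zip. If the server does not send a Content-Length, the function falls through to the plain curl -L -C - call, so the original error can still occur in that case.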