How can I download a single file from multiple locations via HTTP?

I need to download a big file quickly, but all sources I can find have throttled bandwidth. Each of them seems to support HTTP 1.1 Byte Serving (Range Requests), since I can pause and resume the downloads. How can I download it from multiple sources in parallel?

Bengt Avatar asked Apr 13 '13 19:04


2 Answers

Assuming this is a programming question (given that this is StackOverflow), I am going to explain how this works instead of just linking to a download accelerator that takes advantage of it.

What is needed in terms of the server to do this?

  • A server that supports Range HTTP header.
  • A server that allows concurrent connections. A server can support Range while still refusing multiple simultaneous connections, using either endpoint-based or IP-based restrictions on the server side. For this reason, I recommend you set up a simple test server instead of downloading from a file sharing site while testing this.

What is the Range Header?

If the Range header is not set, data is sent over HTTP in order, starting from the beginning of the file. The first byte of the file on the server will be the first byte of the HTTP response body and the last byte of the file on the server will be the last byte of the HTTP response body. The Range header allows you to specify the byte at which the server should start sending, allowing you to "skip" the beginning of the response, and optionally the byte at which it should stop.
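To make the semantics concrete, here is a minimal sketch of what a server does with a single byte range. The function name `serve_byte_range` is made up for illustration, and the sketch only handles the `bytes=first-last` and `bytes=first-` forms, not suffix ranges or multiple ranges. Note that both ends of a range are inclusive.

```python
def serve_byte_range(body: bytes, range_header: str) -> bytes:
    """Return the slice a server would send for one 'bytes=first-last' range."""
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes":
        raise ValueError("only the 'bytes' unit is handled in this sketch")
    first_text, _, last_text = spec.partition("-")
    first = int(first_text)
    # An open-ended range such as 'bytes=5-' runs to the end of the body.
    last = int(last_text) if last_text else len(body) - 1
    # A last position beyond the end of the body is clamped to the final byte.
    last = min(last, len(body) - 1)
    # Both positions are inclusive, hence the + 1 on the slice end.
    return body[first:last + 1]


body = b"StackOverflow!!"
print(serve_byte_range(body, "bytes=0-4"))  # first five bytes
print(serve_byte_range(body, "bytes=5-"))   # everything after them
```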

Actual Answer Example

Our Situation

The response is plain text. The response content is just one word, "StackOverflow!!", encoded in ASCII, meaning each character is one byte. Therefore, the Content-Length header's value is 15 octets (another term for bytes).

We are going to download this file using 3 requests. For the sake of this example, we will say this makes the download 3 times faster, but you should realize that this method makes downloads slower for very small files, because every request pays for its own HTTP headers and its own TCP 3-way handshake. We will also assume that the server supports HEAD requests and that the Content-Length header is sent with the download response. Finally, the download itself will be performed using GET, because a HEAD request reports the headers that the corresponding GET would return. However, there are workarounds for POST.

Juicy Details

First, perform an HTTP HEAD request. Take the "Content-Length" header and divide that value by the number of parallel connections you wish to make. For this example, the Content-Length is 15 and we wish to make 3 connections, so the divided value will be 5.
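A rough sketch of this first step, using only Python's standard library. The helper names `parse_content_length` and `head_content_length` are invented for this example, and the sketch assumes the server answers HEAD and includes Content-Length, as stated above.

```python
from urllib.request import Request, urlopen


def parse_content_length(headers) -> int:
    """Read Content-Length from any mapping-like headers object."""
    value = headers.get("Content-Length")
    if value is None:
        raise RuntimeError("server did not send a Content-Length header")
    return int(value)


def head_content_length(url: str, timeout: float = 10.0) -> int:
    """Ask for the file size with a HEAD request, so no body is transferred."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_content_length(response.headers)


# With the values from this example: 15 bytes over 3 connections.
total = parse_content_length({"Content-Length": "15"})
print(total // 3)
```

In real use you would call `head_content_length` with the file's URL. If the length does not divide evenly, round the per-connection size up so the last bytes are not dropped.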

Now perform the requests in parallel. For each request, the range starts at the number of requests already made times the divided value, and ends one byte before the next range starts, because both ends of a byte range are inclusive. Set the header to "Range: bytes=" followed by the start, then "-", then the start plus the divided value minus 1. For this example, the requests should carry the following headers.

  1. Range: bytes=0-4
  2. Range: bytes=5-9
  3. Range: bytes=10-14

The response of each of these requests should be

  1. Stack
  2. Overf
  3. low!!
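The range arithmetic is easy to get wrong by one, since both ends of a byte range are inclusive. The following is a small sketch of it; `byte_ranges` and `range_header` are hypothetical helper names, and the chunk size is rounded up so that a length that does not divide evenly still gets fully covered.

```python
def byte_ranges(total: int, parts: int) -> list[tuple[int, int]]:
    """Split `total` bytes into at most `parts` inclusive (start, end) ranges."""
    if total <= 0 or parts <= 0:
        raise ValueError("total and parts must be positive")
    chunk = -(-total // parts)  # ceiling division
    ranges = []
    for start in range(0, total, chunk):
        # The end is inclusive, so it is one less than the next start,
        # and the final range stops at the last byte of the file.
        end = min(start + chunk, total) - 1
        ranges.append((start, end))
    return ranges


def range_header(start: int, end: int) -> str:
    """Format one inclusive range as a Range header value."""
    return f"bytes={start}-{end}"


for start, end in byte_ranges(15, 3):
    print("Range:", range_header(start, end))
```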

In essence, we are just conforming to the range units specification (section 3.12 of RFC 2616) as well as the Range header specification (section 14.35 of RFC 2616).

Finally, concatenate the bodies of the responses, in range order, to form the final response data.
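Putting the steps together for the multiple-source case in the question, a sketch might look like the following. It assumes every mirror serves a byte-identical copy of the file and that the total length is already known; `fetch_range` and `download_from_mirrors` are names made up for this example, and there is no retry or error recovery.

```python
from concurrent.futures import ThreadPoolExecutor
from urllib.request import Request, urlopen


def fetch_range(url: str, start: int, end: int) -> bytes:
    """Fetch the inclusive byte range [start, end] from one URL."""
    request = Request(url, headers={"Range": f"bytes={start}-{end}"})
    with urlopen(request, timeout=30) as response:
        # 206 Partial Content means the server honoured the Range header;
        # a plain 200 would carry the whole file instead.
        if response.status != 206:
            raise RuntimeError(f"{url} did not return a partial response")
        return response.read()


def download_from_mirrors(mirrors, total, fetch=fetch_range) -> bytes:
    """Fetch one chunk per mirror in parallel and join the chunks in order."""
    if total <= 0 or not mirrors:
        raise ValueError("need a positive length and at least one mirror")
    chunk = -(-total // len(mirrors))  # ceiling division
    jobs = [
        (mirrors[index], start, min(start + chunk, total) - 1)
        for index, start in enumerate(range(0, total, chunk))
    ]
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # map() yields results in job order, so the parts line up correctly
        # even when the requests finish out of order.
        parts = pool.map(lambda job: fetch(*job), jobs)
        return b"".join(parts)
```

The `fetch` parameter exists only so the assembly logic can be tried without a network, for example by passing a function that slices a local `bytes` object. For large files you would write each part to its offset in a file on disk rather than holding everything in memory.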

Disclaimer: I've never actually tried this but it should work in theory

Isaiah Turner Avatar answered Sep 25 '22 21:09


I can't say whether wget is able to reassemble a file that was fetched from multiple sources.

The following example shows how to do it with aria2c.

You would build a download description file and then pass that to aria2c, like so:

aria2c -i uri.txt --split=5 --min-split-size=1M --max-connection-per-server=5

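where uri.txt might contain the following. Note that aria2c expects the URIs for one file on a single line separated by TAB characters, and each option line indented with at least one space or tab:

http://a.com/file1.iso	http://mirror-1.com/file1.iso	http://mirror-2.com/file1.iso
  dir=/downloads
  out=file1.iso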

This would fetch the same file, from 3 different locations and place it into the downloads folder (dir) with the name file1.iso (out).

Jens A. Koch Avatar answered Sep 25 '22 21:09
