
Multiple simultaneous downloads using Wget?

Tags: download, wget

Use aria2:

aria2c -x 16 [url]
#          |
#          |
#          |
#          ----> the number of connections 

http://aria2.sourceforge.net

I love it!


Wget does not support multiple socket connections in order to speed up download of files.

I think we can do a bit better than gmarian's answer.

The correct way is to use aria2.

aria2c -x 16 -s 16 [url]
#          |    |
#          |    |
#          |    |
#          ---------> the number of connections here

Official documentation:

-x, --max-connection-per-server=NUM: The maximum number of connections to one server for each download. Possible Values: 1-16 Default: 1

-s, --split=N: Download a file using N connections. If more than N URIs are given, first N URIs are used and remaining URLs are used for backup. If less than N URIs are given, those URLs are used more than once so that N connections total are made simultaneously. The number of connections to the same host is restricted by the --max-connection-per-server option. See also the --min-split-size option. Possible Values: 1-* Default: 5
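
The two options quoted above can be wrapped in a small helper. This is only a sketch: the function name aria2_fetch and the ARIA2C override variable are made up for illustration, and the clamp to 16 simply follows the "Possible Values: 1-16" range quoted for -x.

```shell
# Download one URL with aria2c, using the same value for -x and -s.
# aria2_fetch and ARIA2C are hypothetical names, not part of aria2 itself.
aria2_fetch() {
    url=$1
    conns=${2:-16}
    # -x only accepts 1-16 (see the documentation quoted above), so clamp.
    if [ "$conns" -gt 16 ]; then conns=16; fi
    if [ "$conns" -lt 1 ]; then conns=1; fi
    # ARIA2C lets you substitute another command (e.g. echo) for a dry run.
    "${ARIA2C:-aria2c}" -x "$conns" -s "$conns" "$url"
}
```

Calling it with ARIA2C=echo prints the arguments that would be passed instead of downloading anything, which is handy for checking the clamped value.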


Since GNU parallel was not mentioned yet, let me give another way:

cat url.list | parallel -j 8 wget -O {#}.html {}
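
If GNU parallel is not installed, xargs -P (available in GNU and BSD xargs, though not required by POSIX) can play a similar role. A rough sketch, assuming url.list holds one URL per line; the function name fetch_list and the WGET override variable are made up here. Unlike the parallel command above, it keeps wget's default output file names rather than numbering them.

```shell
# Run up to N wget processes at once, one URL per process.
# fetch_list and WGET are hypothetical names used only for this sketch.
fetch_list() {
    list=$1
    jobs=${2:-8}
    # -P sets the maximum number of simultaneous processes,
    # -n 1 passes exactly one URL to each wget invocation.
    xargs -P "$jobs" -n 1 "${WGET:-wget}" -q < "$list"
}
```

Usage would look like fetch_list url.list 8, which mirrors the -j 8 in the parallel example.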

I (probably) found a solution.

In the process of downloading a few thousand log files from one server to the next I suddenly had the need to do some serious multithreaded downloading in BSD, preferably with Wget as that was the simplest way I could think of handling this. A little looking around led me to this little nugget:

wget -r -np -N [url] &
wget -r -np -N [url] &
wget -r -np -N [url] &
wget -r -np -N [url]

Just repeat the wget -r -np -N [url] line for as many threads as you need. This isn't pretty, and there are surely better ways to do it, but if you want something quick and dirty, it should do the trick.

Note: the option -N makes wget download only "newer" files, which means it won't overwrite or re-download files unless their timestamp changes on the server.
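
The repeated wget lines above can also be written as a loop. This is a quick sketch in the same quick-and-dirty spirit, not a tested recipe for any particular server; the function name parallel_wget and the WGET override variable are made up for illustration. The final wait blocks until every background wget has exited.

```shell
# Start N copies of the same recursive wget in the background, then wait.
# parallel_wget and WGET are hypothetical names used only for this sketch.
parallel_wget() {
    url=$1
    count=${2:-4}
    i=0
    while [ "$i" -lt "$count" ]; do
        # Same flags as above: recursive, no parent directories, newer files only.
        "${WGET:-wget}" -r -np -N "$url" &
        i=$((i + 1))
    done
    # Block until all background downloads have finished.
    wait
}
```

For example, parallel_wget "[url]" 4 corresponds to the four wget lines shown earlier, except that all four run in the background and the shell waits for them together.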


Another program that can do this is axel.

axel -n <NUMBER_OF_CONNECTIONS> URL

For basic HTTP auth:

axel -n <NUMBER_OF_CONNECTIONS> "https://user:password@domain.tld/path/file.ext"

Ubuntu man page.