Wget doesn't download recursively after following a redirect

Here is the way I use wget:

wget --recursive --level=10 --convert-links btlregion.ru

btlregion.ru redirects to another URL. When I run wget as above, it follows the redirect but then downloads only that one page, not all pages recursively.

I've already tried --max-redirect=1 and --domains=www.btlregion.ru, and neither works.

If I invoke wget directly on the redirect target, the recursive download works.

asked Nov 17 '13 11:11 by Dmitrii Mikhailov


1 Answer

You need to use --span-hosts (-H) with --domains:

wget --recursive --level=10 --convert-links -H \
--domains=www.btlregion.ru btlregion.ru

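By default, a recursive wget only follows links on the host it was started with (here btlregion.ru). After the redirect, the links point to www.btlregion.ru, which wget treats as a different host, so it does not follow them. --span-hosts allows wget to follow links pointing to other hosts, and --domains restricts this to links on the listed domains, to avoid downloading the whole internet.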

The option --domains will, somewhat contrary to intuition, only work together with -H. This is mentioned in the docs, but in a way that's hard to understand.
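If you want to check the command before letting it loose, here is a minimal sketch that only prints it. The helper name mirror_cmd and its two arguments are made up for this example; the wget options are the ones from the command above, with -H written out as --span-hosts.

```shell
#!/bin/sh
# Sketch: print the recursive wget command for a start URL that
# redirects to another host. mirror_cmd is a hypothetical helper name.
# Usage: mirror_cmd START_URL REDIRECT_HOST
mirror_cmd() {
    start_url=$1
    redirect_host=$2
    # --span-hosts lets wget leave the starting host after the redirect;
    # --domains then limits the crawl to the host given as second argument.
    printf '%s\n' "wget --recursive --level=10 --convert-links --span-hosts --domains=$redirect_host $start_url"
}

# Print the command for the site from the question (nothing is downloaded).
mirror_cmd btlregion.ru www.btlregion.ru
```

This prints the same command as above on a single line. Once it looks right, it can be run by piping the output to sh.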

answered Sep 21 '22 13:09 by sleske