
Using wget to recursively fetch a directory with --no-parent

Tags: shell, wget, root

I am trying to download all of the files in a directory using:

wget -r -N --no-parent -nH -P /media/karunakar --ftp-user=jsjd --ftp-password='hdshd' ftp://ftp.xyz.com/Suppliers/my/ORD20130908

but wget is fetching files from the parent directory, even though I specified --no-parent. I only want the files in ORD20130908.

Karunakar asked Sep 25 '13 12:09




1 Answer

You need to add a trailing slash so that wget treats the last component of the URL as a directory rather than a file. Instead of:

wget -r -N --no-parent -nH -P /media/karunakar --ftp-user=jsjd --ftp-password='hdshd' ftp://ftp.xyz.com/Suppliers/my/ORD20130908

use:

wget -r -N --no-parent -nH -P /media/karunakar --ftp-user=jsjd --ftp-password='hdshd' ftp://ftp.xyz.com/Suppliers/my/ORD20130908/

From the documentation:

Note that, for HTTP (and HTTPS), the trailing slash is very important to ‘--no-parent’. HTTP has no concept of a “directory”—Wget relies on you to indicate what’s a directory and what isn’t. In ‘http://foo/bar/’, Wget will consider ‘bar’ to be a directory, while in ‘http://foo/bar’ (no trailing slash), ‘bar’ will be considered a filename (so ‘--no-parent’ would be meaningless, as its parent is ‘/’).
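
If the URL comes from a variable or a script argument, it is easy to lose the trailing slash by accident. A minimal sketch of a guard for that case is below; ensure_trailing_slash is a hypothetical helper name invented for this example (it is not part of wget), and the wget call is left commented out so the snippet runs without network access.

```shell
#!/bin/sh
# Hypothetical helper, not a wget feature: print the URL with exactly one
# "/" appended if it does not already end in one.
ensure_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;   # already ends in a slash, leave as-is
        *)  printf '%s/\n' "$1" ;;  # append the slash
    esac
}

url=$(ensure_trailing_slash "ftp://ftp.xyz.com/Suppliers/my/ORD20130908")
echo "$url"

# Same options as in the question, now with the normalized URL:
# wget -r -N --no-parent -nH -P /media/karunakar \
#      --ftp-user=jsjd --ftp-password='hdshd' "$url"
```

The helper only normalizes the string; it cannot tell whether the remote path really is a directory, so it should only be used when you already know that it is.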

Synetech answered Oct 28 '22 06:10