I'm attempting to use wget to recursively grab only the .jpg files from a particular website, with a view to creating an amusing screensaver for myself. Not such a lofty goal really.
The problem is that the pictures are hosted elsewhere (mfrost.typepad.com), not on the main domain of the website (www.cuteoverload.com).
I have tried using "-D" to specify the allowed domains, but sadly no cute jpgs have been forthcoming. How could I alter the line below to make this work?
wget -r -l2 -np -w1 -D www.cuteoverload.com,mfrost.typepad.com -A.jpg -R.html.php.gif www.cuteoverload.com/
Thanks.
Wget supports HTTP, HTTPS, and FTP, as well as retrieval through HTTP proxies. It is non-interactive, meaning that it can work in the background while the user is not logged on to the system, which makes it a good fit for shell scripts that need to grab files, including from HTTPS-enabled websites.
You can pass the --no-proxy option to the wget command. This option tells wget not to use proxies, even if the appropriate `*_proxy' environment variable is defined.
If you don't want to check the validity of the server's certificate, pass the --no-check-certificate option on the wget command line.
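As a rough illustration of the two options mentioned above, here is a minimal dry-run sketch. The URL is a hypothetical placeholder, not something from the original post, and the leading `echo` only prints each command so nothing is downloaded; remove it to run wget for real.

```shell
#!/bin/sh
# Hypothetical URL, used purely for illustration.
URL="https://example.com/archive.tar.gz"

# Ignore any http_proxy / https_proxy / ftp_proxy environment variables.
echo wget --no-proxy "$URL"

# Skip certificate validation. This disables a security check,
# so it is only reasonable for hosts you already trust.
echo wget --no-check-certificate "$URL"
```

Both options can be combined on a single command line if needed.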
An examination of wget's man page[1] says this about -D:
Set domains to be followed. domain-list is a comma-separated list of domains. Note that it does not turn on -H.
This advisory about -H looks interesting:
Enable spanning across hosts when doing recursive retrieving.
So you need merely to add the -H flag to your invocation.
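Putting that together, the command from the question might look like the sketch below. It assumes the reject list was meant as three separate suffixes (wget's -A and -R lists are comma-separated) and that the site is reachable over plain HTTP. The variable names are only for readability, and the leading `echo` makes this a dry run that prints the command; remove it to start the download.

```shell
#!/bin/sh
# Hosts wget may visit: the site itself plus the host serving the images.
DOMAINS="www.cuteoverload.com,mfrost.typepad.com"
START_URL="http://www.cuteoverload.com/"

# -H turns on host spanning; -D then limits the spanning to $DOMAINS.
# -A keeps only .jpg files; -R lists the suffixes to reject, comma-separated.
echo wget -r -l2 -np -w1 -H -D "$DOMAINS" -A .jpg -R .html,.php,.gif "$START_URL"
```

Without -D, -H would let the crawl wander onto every host the pages link to, so the two options are best used together here.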
(Having done this, it looks like all the images are restricted to mfrost.typepad.com/cute_overload/images/2008/12/07 and mfrost.typepad.com/cute_overload/images/2008/12/08.)
-- [1] Although wget's primary reference manual is in info format.