I am writing a bash script and using wget to retrieve some PDF files from a website. For example:
wget www.barb.co.uk/news/item-subscriber/id/213/index.html
But wget saves the file as index.html. If I am in a browser and enter that URL, it correctly downloads the file with its real name - "BARB Bulletin 25 - December 10.pdf".
How can I get wget to do the same? Or is there another way I can find the real name of the file (from within a bash script)?
Downloading a file: To download a file with wget, type wget followed by the URL of the file. wget downloads the file at that URL and saves it in the current directory.
By default, the downloaded file is saved under the last component of the URL path. To save it under a different name, use the -O option. Syntax: wget -O <fileName> <URL>
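As a rough illustration of the -O form inside a script, here is a minimal sketch. The helper name fetch_as and the example.com URL are made up for this example and are not from the original post.

```shell
# Hypothetical helper: download the URL in $2 and save it under the name in $1.
# -O writes the response body to the named file instead of the URL-derived name.
fetch_as() {
  wget -O "$1" "$2"
}

# Example call (placeholder URL):
# fetch_as "bulletin.pdf" "http://www.example.com/news/index.html"
```

Note that this only helps when you already know the name you want; it does not discover the name the server suggests.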
Explanation: The Content-Disposition header can be used by a server to suggest a filename for a downloaded file. By default, wget uses the last part of the URL as the filename, but you can override this with --content-disposition, which uses the server's suggested name.
Resuming a wget download is straightforward. Open a terminal in the directory where the partial file was being saved and run wget with the -c flag and the same URL.
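In a script, the resume step might be wrapped like this. It is only a sketch: the helper name resume_download is hypothetical, and resuming depends on the server supporting partial (range) requests.

```shell
# Hypothetical helper: continue a partially downloaded file for the URL in $1.
# Run it from the directory that holds the partial file; -c tells wget to
# pick up where the existing file ends instead of starting a new copy.
resume_download() {
  wget -c "$1"
}

# Example call (placeholder URL):
# resume_download "http://www.example.com/files/large.iso"
```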
You can use the --content-disposition option to make wget look at the HTTP response headers for a filename suggested by the server, which helps in most cases.
Example:
wget --content-disposition www.barb.co.uk/news/item-subscriber/id/213/index.html
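If you only want to find the real name from within a bash script, one possible approach is to read the response headers yourself. The sketch below assumes the server sends a plain filename= parameter in Content-Disposition; the helper name filename_from_headers is made up for this example, and the encoded filename*= form is not handled.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# filename suggested by the last Content-Disposition header, if there is one.
filename_from_headers() {
  tr -d '\r' |
    grep -i '^ *content-disposition:' |
    tail -n 1 |
    sed -n 's/.*[Ff]ilename="\{0,1\}\([^";]*\)"\{0,1\}.*/\1/p'
}

# Possible use: wget -S prints the server headers on stderr, and --spider
# asks for the headers without saving the body. This assumes the server
# returns Content-Disposition for that kind of request, which not all do.
# name=$(wget -S --spider "$url" 2>&1 | filename_from_headers)
```

If the function prints nothing, the server did not suggest a name in that form and the script should fall back to its own naming.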