WGET: Removing 'filename' since it should be rejected

Tags:

shell

wget

I am trying to download all the wmv files that have the word 'high' in their name from a website, using wget with the following command:

wget -A "*high*.wmv" -r -H -l1 -nd -np -erobots=off http://mywebsite.com -O yl-`date +%H%M%S`.wmv

The file starts and finishes downloading, but immediately afterwards I get:

Removing yl-120058.wmv since it should be rejected.
  • Why is that and how could I avoid it?
  • How could I make the command spider the whole website for that type of file automatically?
Cy. asked Jun 30 '11 10:06



1 Answer

It's because the accept list is being checked twice, once before downloading and once after saving. The latter is the behavior you see here ("it's not a bug, it's a feature"):

Your saved file yl-120058.wmv does not match your specified pattern -A "*high*.wmv" and will thus be rejected and deleted.

Quote from wget manual:

Finally, it's worth noting that the accept/reject lists are matched twice against downloaded files: [..] the local file's name is also checked against the accept/reject lists to see if it should be removed. [..] However, this can lead to unexpected results.
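To see why the saved name fails the second check, here is a minimal sketch that approximates the accept-list match with plain shell glob matching. The helper accept_verdict and the sample name clip-high-01.wmv are hypothetical and only for illustration; wget's own matching may differ in details.

```shell
#!/bin/sh
# Hypothetical helper: approximates wget's -A check on a local file name
# using shell glob matching (an assumption, not wget's actual code).
accept_verdict() {
    # $1 = local file name, $2 = accept pattern
    case "$1" in
        $2) echo "kept" ;;      # name matches the pattern
        *)  echo "removed" ;;   # name does not match, so it would be deleted
    esac
}

pattern='*high*.wmv'

# The name forced by -O contains no "high", so it fails the check.
echo "yl-120058.wmv: $(accept_verdict yl-120058.wmv "$pattern")"

# A file saved under a remote name that contains "high" passes.
echo "clip-high-01.wmv: $(accept_verdict clip-high-01.wmv "$pattern")"

# One possible way around it (an assumption, not part of the answer above):
# drop -O so each file keeps its remote name, which already matches -A:
#   wget -A "*high*.wmv" -r -H -l1 -nd -np -e robots=off http://mywebsite.com
```

Under these assumptions, the first name is reported as removed and the second as kept. If the timestamped name is still wanted, the files could be renamed after wget has finished.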

EPSG31468 answered Sep 23 '22 07:09
