
Why does WGET return 2 error messages before succeeding?

Tags: shell, wget

I am using a script to pull down some XML data with WGET from a URL that requires authentication.

In doing so, my script produces the following output for each URL accessed (IPs and hostnames changed to protect the guilty):

> Resolving host.name.com... 127.0.0.1
> Connecting to host.name.com|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 401 Access denied
> Connecting to host.name.com|127.0.0.1|:80... connected.
> HTTP request sent, awaiting response... 401 Unauthorized
> Reusing existing connection to host.name.com:80.
> HTTP request sent, awaiting response... 200 OK

Why does WGET report that accessing the URL failed twice before it connects successfully? Is there a way to shut it up, or to get it to connect properly on the first attempt?

For reference, here's the line I am using to call WGET:

wget --http-user=USERNAME --password=PASSWORD -O file.xml http://host.name.com/file.xml
Dinedal asked Jan 11 '10 18:01




2 Answers

This appears to be by design. Following the advice of @Wayne Conrad, I added the -d switch and was able to observe the first attempt failing because NTLM was required, and the second attempt failing because the first NTLM attempt was only level 1, where a level 3 NTLM challenge-response was required. WGET finally provides the needed authentication at the third attempt.
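As a rough sketch of how the -d switch could be used to look at just the authentication exchange: the helper below filters the debug trace down to status lines and auth headers. It assumes a wget build with debug support and that the trace prints raw header lines; the function name show_auth_exchange is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: show only status lines and auth headers from wget's debug trace.
# Assumption: this wget was compiled with debug support, otherwise -d is unavailable.
show_auth_exchange() {
    # $1 = user, $2 = password, $3 = URL
    # -d writes the trace to stderr; -O /dev/null discards the downloaded body.
    wget -d --http-user="$1" --password="$2" -O /dev/null "$3" 2>&1 |
        grep -i -E '^(HTTP/|WWW-Authenticate:|Authorization:)'
}

# Example call, using the placeholders from the question:
# show_auth_exchange USERNAME PASSWORD http://host.name.com/file.xml
```

Note that the filtered output can include the Authorization header, so it should not be pasted anywhere public.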

WGET does get a cookie that prevents re-authenticating for the duration of the session, which would avoid the repeated failures if the connection weren't terminated between files. For that to happen I would need to pass WGET a list of files in a single invocation; however, I am unable to, because I do not know the file names in advance.
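For readers who do know their URLs up front (which is not the situation described here), a single invocation over a list might look like the sketch below. The file names file1.xml and file2.xml, the ./xml directory, and the fetch_all helper are all hypothetical. -O is left out on purpose, since with several URLs it would write everything into one file.

```shell
#!/bin/sh
# Hypothetical URL list; the real file names are unknown in the original scenario.
url_list=$(mktemp)
printf '%s\n' \
    'http://host.name.com/file1.xml' \
    'http://host.name.com/file2.xml' > "$url_list"

# Hypothetical helper: fetch every URL in the list with one wget process.
fetch_all() {
    # $1 = user, $2 = password, $3 = output directory, $4 = URL list file
    # -i reads URLs from the file, one per line; -P sets the directory for saved files.
    wget --http-user="$1" --password="$2" -P "$3" -i "$4"
}

# Example call:
# fetch_all USERNAME PASSWORD ./xml "$url_list"
```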

Dinedal answered Sep 18 '22 04:09


You seem to have a newer version of wget. Versions after 1.10.2 do not send authentication unless the server challenges for it first, which is why the first request fails. The second fails for the reason you described.

You can eliminate one of the failures by adding the --auth-no-challenge parameter. The first request is then sent with "basic" authentication, which will fail, and the second is sent in "digest" mode, which should work.
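A minimal sketch of the question's command with that flag added, wrapped in a hypothetical fetch_preauth helper so the arguments are easy to see. Bear in mind that --auth-no-challenge sends the username and password as Basic authentication without waiting for a challenge, so over plain HTTP they are only base64-encoded, not encrypted.

```shell
#!/bin/sh
# Hypothetical helper: same call as in the question, plus --auth-no-challenge.
fetch_preauth() {
    # $1 = user, $2 = password, $3 = output file, $4 = URL
    # --auth-no-challenge sends Basic credentials with the first request
    # instead of waiting for the server's 401 challenge.
    wget --auth-no-challenge --http-user="$1" --password="$2" -O "$3" "$4"
}

# Example call, using the placeholders from the question:
# fetch_preauth USERNAME PASSWORD file.xml http://host.name.com/file.xml
```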

Basil Dsouza answered Sep 19 '22 04:09