I am currently using the following command to retrieve data from a site:
wget http://www.example.com --user=joe --password=schmoe --auth-no-challenge
I want to expand this to be recursive; however, my understanding is that this will resend the HTTP Auth credentials on each request.
Hence, is it possible to run the Basic HTTP Auth once, capture the cookies, and then trigger a recursive load with those cookies?
This does not appear to work:
wget --save-cookies=cookies.txt --user=joe --password=schmoe --auth-no-challenge http://www.example.com
Followed by:
wget --load-cookies=cookies.txt -r -p http://www.example.com/pages.html
The HTTP Basic authentication scheme is stateless: it does not establish a session or set a cookie, so there is nothing for --save-cookies to capture and the credentials must be sent with every subsequent request. (A Bearer scheme such as OAuth2 also sends its token with each request; the difference is that the token stands in for the password.) The apparent exception is at the "application" layer, where a browser caches the credentials and resends them for you, but that is a browser convenience over which there is minimal control, and it does not apply to wget.
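To make this concrete, here is a minimal sketch of what Basic auth puts on the wire and how the recursive fetch could be run with the credentials supplied on every request. The variable names and the `fetch_recursive` helper are illustrative, not part of wget; the flags are the ones already used in the question.

```shell
# Credentials from the question.
user='joe'
password='schmoe'

# Basic auth is only "user:password", base64-encoded, in a request header.
# No session is created, so the header must accompany every request.
token=$(printf '%s:%s' "$user" "$password" | base64)
auth_header="Authorization: Basic $token"
echo "$auth_header"

# Hypothetical helper: recursive fetch with the credentials on each request.
# --auth-no-challenge sends the header up front rather than waiting for a 401.
fetch_recursive() {
    wget -r -p --user="$user" --password="$password" --auth-no-challenge "$1"
}
```

Calling `fetch_recursive http://www.example.com/pages.html` would then crawl the site with the same header attached to each request. Since base64 is an encoding and not encryption, this is only reasonable over HTTPS.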
Here is a good summary of the drawbacks of HTTP Basic, including the fact that credentials need to be sent with every request.
Check out the Hypertext Transfer Protocol (HTTP) Authentication Scheme Registry for a comprehensive list of authentication schemes.