
How to authenticate a site with Python using urllib2?

After much reading here on Stack Overflow and elsewhere on the web, I'm still struggling to get this to work.

My challenge: use Python and urllib2 to access a restricted part of a website of which I'm a member.

From what I've read, the code should look like this:

import urllib2

mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()

url = 'http://www.domain.com'

mgr.add_password(None, url, 'username', 'password')
handler = urllib2.HTTPBasicAuthHandler(mgr)
opener = urllib2.build_opener(handler)

urllib2.install_opener(opener)

try:
    response = urllib2.urlopen('http://www.domain.com/restrictedpage')
    page = response.read()
    print response.geturl()  # geturl() belongs to the response object, not to the string returned by read()
except IOError, e:
    print e

The script doesn't print "http://www.domain.com/restrictedpage" but "http://www.domain.com/login", so my credentials apparently aren't being sent or processed and I'm being redirected to the login page.

How can I get this to work? I've been trying for days and keep hitting the same dead ends. I've tried all the examples I could find to no avail.

My main question is: what's needed to authenticate to a website using Python and urllib2? Quick question: what am I doing wrong?

asked Oct 15 '25 19:10 by Roland


1 Answer

First check manually what really happens when you authenticate successfully (instructions for Chrome):

  • Open the developer tools in Chrome (Ctrl + Shift + I)
  • Click the Network tab
  • Authenticate manually (go to the page, type your username and password, and submit)
  • Find the POST request in the Network tab of the developer tools
  • Check its Request Headers, Query String Parameters and Form Data. These contain everything you need to reproduce in your own POST.
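To make the last step concrete, here is a minimal sketch of how the Form Data fields become the body of a POST. It is written for Python 3 (in Python 2, `urlencode` lives in `urllib` instead). The field names and values are placeholders, not taken from any real site; use exactly the names the Network tab shows for yours.

```python
from urllib.parse import urlencode

# Hypothetical field names and values: copy the exact names listed
# under "Form Data" in the developer tools for your site.
form_fields = [
    ("UserName", "user"),
    ("Password", "p@ss word"),
    ("LoginButton", "Login"),
]

# urlencode() builds an application/x-www-form-urlencoded body:
# reserved characters are percent-escaped and spaces become '+'.
body = urlencode(form_fields)
print(body)
```

If the body printed here differs from what the browser sent (a missing hidden field, for example), the login will most likely fail in the same way from a script.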

Then install the "Advanced Rest Client (ARC)" Chrome extension.

Use ARC to construct a valid POST for authentication.

Now you know what your headers and form data must contain. Here's sample code using Requests that worked for me on one particular site:

import requests

USERNAME = 'user' # put correct username here
PASSWORD = 'password' # put correct password here

LOGINURL = 'https://login.example.com/'
DATAURL = 'https://data.example.com/secure_data.html'

# A Session keeps cookies between requests, so the login carries over
session = requests.Session()

req_headers = {
    'Content-Type': 'application/x-www-form-urlencoded'
}

# Field names must match the Form Data seen in the developer tools
formdata = {
    'UserName': USERNAME,
    'Password': PASSWORD,
    'LoginButton' : 'Login'
}

# Authenticate (redirects are not followed, so the login response itself can be inspected)
r = session.post(LOGINURL, data=formdata, headers=req_headers, allow_redirects=False)
print(r.headers)
print(r.status_code)
print(r.text)

# Read data
r2 = session.get(DATAURL)
print("___________DATA____________")
print(r2.headers)
print(r2.status_code)
print(r2.text)
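
Since the question asked about urllib2 specifically, the same idea can be sketched with the standard library alone. This is an untested-against-any-real-site sketch for Python 3, where the relevant classes live in `urllib.request` and `http.cookiejar` (in Python 2 they are in `urllib2` and `cookielib`). The URLs and field names are the same placeholders as above, and the helper function names are made up for illustration.

```python
import http.cookiejar
import urllib.parse
import urllib.request

# Placeholder URLs, same as in the Requests example above.
LOGINURL = 'https://login.example.com/'
DATAURL = 'https://data.example.com/secure_data.html'


def build_session_opener():
    """Return an opener that stores and resends cookies, plus its jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(username, password):
    """Build (but do not send) the login POST. Field names are assumptions."""
    body = urllib.parse.urlencode([
        ('UserName', username),
        ('Password', password),
        ('LoginButton', 'Login'),
    ]).encode('ascii')
    # Supplying data makes urllib issue a POST instead of a GET.
    return urllib.request.Request(
        LOGINURL, data=body,
        headers={'Content-Type': 'application/x-www-form-urlencoded'})


def fetch_protected(username, password):
    """Log in, then fetch DATAURL with the cookies set by the login."""
    opener, _jar = build_session_opener()
    with opener.open(build_login_request(username, password)) as resp:
        resp.read()
    with opener.open(DATAURL) as resp:
        # If geturl() still points at the login page, the login failed.
        return resp.geturl(), resp.read()
```

Calling `fetch_protected('user', 'password')` would perform the two requests. The cookie jar plays the role of the Requests session. `HTTPBasicAuthHandler`, by contrast, only sends credentials in response to an HTTP 401 challenge, so it does nothing for a site that uses a login form.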
answered Oct 17 '25 07:10 by samuel5


