Accepting and Sending Cookies with Mechanize

I need to fill in a login form on a webpage that requires cookies and get some information from the resulting page. Since this needs to be done at odd hours of the night, I'd like to automate the process and am therefore using mechanize. Any other suggestions are welcome, but note that I have to run my script on a school server on which I cannot install new software. Mechanize is pure Python, so I am able to get around this problem.

The problem is that the page that hosts the login form requires that I be able to accept and send cookies. Ideally, I'd like to accept and send back all cookies that the server sends me, rather than hard-code my own cookies.

So, I set out to write my script with mechanize, but I seem to be handling cookies wrong. Since I can't find helpful documentation anywhere (please point it out if I'm blind), I am asking here.

Here is my mechanize script:

import mechanize as mech

br = mech.Browser()
br.set_handle_robots(False)
print "No Robots"
br.set_handle_redirect(True)
br.open("some internal uOttawa website")
br.select_form(nr=0)
br.form['j_username'] = 'my username'
print "Login: ************"
br.form['j_password'] = 'my password'
print "Password: ************"
response = br.submit()
print response.read()

This prints the following:

No Robots
Login: ************
Password: ************

<html>
<body>
    <img src="/idp/images/uottawa-logo-dark.png" />
    <h3>ERROR</h3>
    <p>
        An error occurred while processing your request.  Please contact your helpdesk or
        user ID office for assistance.
    </p>
    <p>
       This service requires cookies.  Please ensure that they are enabled and try your 
       going back to your desired resource and trying to login again.
    </p>
    <p>
       Use of your browser's back button may cause specific errors that can be resolved by
       going back to your desired resource and trying to login again.
    </p>
        <p>
           If you think you were sent here in error,
           please contact technical support
        </p>       
</body>
</html>

This is indeed the page that I would get if I disabled cookies on my Chrome browser and attempted the same thing.

I've tried adding a cookie jar as follows, with no luck.

import cookielib  # Python 2 standard library (http.cookiejar in Python 3)

br = mech.Browser()
cj = cookielib.LWPCookieJar()
br.set_cookiejar(cj)

I looked at several mechanize documentation sources. One of them mentions:

A common mistake is to use mechanize.urlopen(), and the .extract_cookies() and 
.add_cookie_header() methods on a cookie object themselves. 
If you use mechanize.urlopen() (or OpenerDirector.open()), 
the module handles extraction and adding of cookies by itself,
so you should not call .extract_cookies() or .add_cookie_header().

This seems to say that my first method should work, but it doesn't.

I'd appreciate any help with this - it's confusing, and there seems to be a severe lack of documentation.

inspectorG4dget asked Nov 27 '22 10:11


1 Answer

I came across the exact same message while authenticating against a Shibboleth website with Mechanize, because I made the same mistake as you. It looks like I figured it out.

Short answer

The link you need to open is:

br.open("https://web30.uottawa.ca/Shibboleth.sso/Login?target=https://web30.uottawa.ca/hr/web/post-register")

Instead of:

br.open("https://idp.uottawa.ca/idp/login.jsp?actionUrl=%2Fidp%2FAuthn%2FUserPassword")
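Putting the corrected entry point into the question's script might look like the sketch below. The function name shibboleth_login and its signature are mine (hypothetical), not part of mechanize. It only assumes that br behaves like a mechanize.Browser() and that the form fields are still called j_username and j_password, as in the question.

```python
# Entry point taken from the answer above: the service provider's login URL,
# not the IdP's login.jsp.
SP_LOGIN_URL = ("https://web30.uottawa.ca/Shibboleth.sso/Login"
                "?target=https://web30.uottawa.ca/hr/web/post-register")


def shibboleth_login(br, username, password, url=SP_LOGIN_URL):
    """Log in through the SP entry point using a mechanize-style browser.

    `br` is expected to behave like mechanize.Browser(); this helper is
    illustrative and not part of mechanize itself.
    """
    br.set_handle_robots(False)
    # Opening the SP URL redirects to the IdP login form; the browser's
    # cookie jar collects the cookies set along the way.
    br.open(url)
    br.select_form(nr=0)
    br.form['j_username'] = username
    br.form['j_password'] = password
    return br.submit()
```

With mechanize available, the call would be something like shibboleth_login(mechanize.Browser(), 'my username', 'my password'), followed by reading the returned response. I have not checked whether the page after login needs any further form submission, so treat this as a starting point.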

Why?

Shibboleth: Connect easily and securely to a variety of services with one simple login.

The Shibboleth login page by itself is useless if you don't tell it which service you want to log in to. Let's analyse the HTTP headers and compare the cookies you get for both requests.

1. Opening https://idp.uottawa.ca/idp/login.jsp?actionUrl=%2Fidp%2FAuthn%2FUserPassword

Cookie: JSESSIONID=C2D4A19B2994BFA287A328F71A281C49; _ga=GA1.2.1233451770.1401374115; arp_scroll_position=-1; tools-resize=tools-resize-small; lang-prev-page=en; __utma=251309913.1233451770.1401374115.1401375882.1401375882.1; __utmb=251309913.14.9.1401376471057; __utmz=251309913.1401375882.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); lang=en

2. Opening https://web30.uottawa.ca/Shibboleth.sso/Login?target=https://web30.uottawa.ca/hr/web/post-register

Cookie: JSESSIONID=8D6BEA53823CC1C3045B2CE3B1D61DB0; _idp_authn_lc_key=fc18251e-e5aa-4f77-bb17-5e893d8d3a43; _ga=GA1.2.1233451770.1401374115; arp_scroll_position=-1; tools-resize=tools-resize-small; lang-prev-page=en; __utma=251309913.1233451770.1401374115.1401375882.1401375882.1; __utmb=251309913.16.9.1401378064938; __utmz=251309913.1401375882.1.1.utmcsr=google|utmccn=(organic)|utmcmd=organic|utmctr=(not%20provided); lang=en

What's the difference? The second request carries one more cookie: _idp_authn_lc_key (fc18251e-e5aa-4f77-bb17-5e893d8d3a43 in the capture above; I saw a different value on another attempt). I suppose it is the cookie saying "I want to log in there".
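If you would rather not compare the headers by eye, a few lines of Python can do the diff. This is only a sketch: cookie_names is a helper I made up for the illustration, and the two header strings are shortened copies of the captures above.

```python
def cookie_names(header):
    """Return the set of cookie names in a raw 'Cookie:' header value."""
    names = set()
    for pair in header.split(";"):
        name, sep, _value = pair.strip().partition("=")
        if sep:  # skip fragments that are not name=value pairs
            names.add(name)
    return names


# Shortened copies of the two headers captured above.
idp_direct = ("JSESSIONID=C2D4A19B2994BFA287A328F71A281C49; "
              "_ga=GA1.2.1233451770.1401374115; lang=en")
via_sp = ("JSESSIONID=8D6BEA53823CC1C3045B2CE3B1D61DB0; "
          "_idp_authn_lc_key=fc18251e-e5aa-4f77-bb17-5e893d8d3a43; "
          "_ga=GA1.2.1233451770.1401374115; lang=en")

extra = cookie_names(via_sp) - cookie_names(idp_direct)
print(sorted(extra))  # ['_idp_authn_lc_key']
```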

During the authentication process, the IdP will set a cookie named _idp_authn_lc_key. This cookie contains only information necessary to identify the current authentication process (which usually spans multiple requests/responses) and is deleted after the authentication process completes.

Source: https://wiki.shibboleth.net/confluence/display/SHIB2/IdPCookieUsage
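To convince yourself that the entry point matters more than the HTTP library, here is a self-contained toy model in Python 3 using only the standard library. Everything in it is made up for the demonstration: the ToyIdP class, the /start and /login paths and the cookie value are hypothetical, and the real IdP is of course far more involved. The client uses a cookie jar in both cases; only the starting URL differs.

```python
import threading
import urllib.request
from http.cookiejar import CookieJar
from http.server import BaseHTTPRequestHandler, HTTPServer


class ToyIdP(BaseHTTPRequestHandler):
    """Toy stand-in for an IdP: /start sets a flow cookie, /login needs it."""

    def do_GET(self):
        if self.path == "/start":
            # Entry point: hand out the flow cookie, then redirect to the form.
            self.send_response(302)
            self.send_header("Set-Cookie", "_idp_authn_lc_key=demo; Path=/")
            self.send_header("Location", "/login")
            self.end_headers()
            return
        has_key = "_idp_authn_lc_key=" in self.headers.get("Cookie", "")
        body = b"login form" if has_key else b"This service requires cookies."
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(base, path):
    """Fetch base + path with a fresh cookie jar, following redirects."""
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(CookieJar()))
    with opener.open(base + path) as resp:
        return resp.read().decode()


server = HTTPServer(("127.0.0.1", 0), ToyIdP)  # port 0: pick any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
base = "http://127.0.0.1:%d" % server.server_port

direct = fetch(base, "/login")     # skips the entry point: no flow cookie
via_start = fetch(base, "/start")  # redirected to /login with the cookie set

server.shutdown()
server.server_close()
print(direct)
print(via_start)
```

Cookies are "enabled" in both requests, yet the direct request still gets the error page, which is the situation in the question.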


How did I find that link? I dug around the site and found that https://web30.uottawa.ca/hr/web/en/user/registration points to the login form with the following link:

<a href="https://web30.uottawa.ca/Shibboleth.sso/Login?target=https://web30.uottawa.ca/hr/web/post-register" 
   class="button standard"><span>Create your account using infoweb</span></a>
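If you have to hunt for such a link on another Shibboleth-protected site, you can scan a page's HTML for anchors that point at /Shibboleth.sso/Login. The sketch below uses Python 3's standard html.parser; the LoginLinkFinder class is a hypothetical helper, and the sample page is just the anchor quoted above. The /Shibboleth.sso/Login path matches this site and may be configured differently elsewhere.

```python
from html.parser import HTMLParser


class LoginLinkFinder(HTMLParser):
    """Collect href values of <a> tags that point at a Shibboleth SP login."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href") or ""
        if "/Shibboleth.sso/Login" in href:
            self.links.append(href)


# The anchor quoted above, standing in for a downloaded page.
page = '''<a href="https://web30.uottawa.ca/Shibboleth.sso/Login?target=https://web30.uottawa.ca/hr/web/post-register"
   class="button standard"><span>Create your account using infoweb</span></a>'''

finder = LoginLinkFinder()
finder.feed(page)
print(finder.links)
```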

So this was not a problem with Mechanize; rather, Shibboleth is a little hard to understand at first glance. You will find more information on the Shibboleth authentication flow here.

Stéphane Bruckert answered Dec 05 '22 11:12