
Python Mechanize - Login

I am trying to log in to a website and get data from it, but I can't get mechanize to work on the following site. I have provided the HTML below. Could someone please briefly explain how I can log in and print the next page?

I have tried using mechanize and looping through br.forms(). I can see the form there, but I am having problems filling in my username and password and then submitting it.

<div class="loginform" id="loginpage" style="width: 300px;">
<div class="loginformentries" style="overflow: hidden;">
<div class="clearfix">
<div class="loginformtitle">Sign-in to your account</div>
</div>
<div class="clearfix">
<div class="loginformlabel"><label for="USERID">Username:</label></div>
<div class="loginforminput"><input name="USERID" id="USERID" style="width: 150px;" type="text" value=""></div>
</div>
<div class="clearfix">
<div class="loginformlabel"><label for="PASSWDTXT">Password:</label></div>
<div class="loginforminput"><input name="PASSWDTXT" id="PASSWDTXT" style="width: 150px;" type="password" value=""></div>
</div>
<div class="clearfix">
<div class="loginformlabel"><label for="usertype">Select Role:</label></div>
<div class="loginforminput"><select name="usertype" id="usertype" style="width: 150px;"><option value="participant">Participant</option>
<option value="sponsor">Sponsor</option></select></div>
</div>
<div class="loginformsubmit" style="text-align: right;"><span class="button"><button class="buttoninsidebuttonclass" type="submit">Login</button></span></div>
</div>
<div class="loginformdescription">Both entries are case sensitive. If you fail to login <strong>five</strong> consecutive times your account could be disabled.</div>
</div>
</div>
</div>

I am trying something like this...

import mechanize

br = mechanize.Browser()

br.open("test")

br.select_form(name="loginform")
br["USERID"] = 'xxxxx'
br["PASSWDTXT"] = 'xxxxx'
br.submit()
print br.title()

But I don't know how to verify that I am on the next page.

Trying_hard asked Jan 17 '14 16:01



1 Answer

All those divs should be wrapped in a form element. Look for it and find its name attribute; that is the form you'll want to log in with. Then you can use the snippet below to get the cookies you'll need for further browsing.

import cookielib 
import urllib2 
import mechanize 

# Browser 
br = mechanize.Browser() 

# Enable cookie support for urllib2 
cookiejar = cookielib.LWPCookieJar() 
br.set_cookiejar( cookiejar ) 

# Browser options 
br.set_handle_equiv( True ) 
br.set_handle_gzip( True ) 
br.set_handle_redirect( True ) 
br.set_handle_referer( True ) 
br.set_handle_robots( False ) 

# Follow Refresh redirects, but only when the delay is at most 1 second 
br.set_handle_refresh( mechanize._http.HTTPRefreshProcessor(), max_time = 1 ) 

br.addheaders = [ ( 'User-agent', 'Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.0.1) Gecko/2008071615 Fedora/3.0.1-1.fc9 Firefox/3.0.1' ) ] 

# authenticate 
br.open( "the/page/you/want/to/login" )  # placeholder: URL of the login page
br.select_form( name="the name of the form from above" ) 
# these two come from the code you posted
# where you would normally put in your username and password
br[ "USERID" ] = "yourLogin"        # placeholder: your username
br[ "PASSWDTXT" ] = "yourPassword"  # placeholder: your password
res = br.submit() 

print "Success!\n"
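
If you aren't sure which name to pass to select_form, it can help to list the forms mechanize found on the page. The following is only a minimal sketch and not part of the original snippet: describe_forms and find_form_index are hypothetical helper names, and they rely on nothing more than br.forms() and the name and controls attributes of mechanize's form objects.

```python
def describe_forms(browser):
    """Return a list of (index, form name, control names) for the current page.

    `browser` is expected to be a mechanize.Browser that has already opened
    a page. This is a hypothetical helper, not a mechanize function.
    """
    summary = []
    for index, form in enumerate(browser.forms()):
        control_names = [control.name for control in form.controls]
        summary.append((index, form.name, control_names))
    return summary


def find_form_index(browser, field_name):
    """Return the index of the first form containing `field_name`, or None."""
    for index, _name, control_names in describe_forms(browser):
        if field_name in control_names:
            return index
    return None
```

After br.open(...), find_form_index(br, "USERID") should give a number you can pass as br.select_form(nr=...) when the form turns out to have no name attribute.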

Once the form has been submitted, your login cookies are stored in the cookiejar, so you can use the same br object to fetch any other page, like this:

response = br.open( "page/needed/after/login" )  # placeholder: URL of the page you need
returnPage = response.read() 

This will give you the HTML source of the page, which you can then parse any way you want.
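
The print "Success!" line above runs whether or not the login worked, so to check that you really reached the next page you can look at where the submit landed and what came back. Here is a rough sketch of such a check. It assumes that a failed login shows the form again, including the PASSWDTXT field; that is a guess about this site, not something the posted HTML confirms. looks_logged_in is a hypothetical helper name.

```python
def looks_logged_in(final_url, html, login_url, password_field="PASSWDTXT"):
    """Heuristic check that submitting the login form led somewhere new.

    Returns False if we are still on the login URL or if the returned HTML
    still contains the password input, True otherwise. Both checks are
    assumptions about how the site behaves; adjust them to the real pages.
    """
    # Response bodies may arrive as bytes; decode so the substring test works.
    if isinstance(html, bytes):
        html = html.decode("utf-8", "replace")
    still_on_login_url = final_url.rstrip("/") == login_url.rstrip("/")
    still_shows_password_field = ('name="%s"' % password_field) in html
    return not (still_on_login_url or still_shows_password_field)
```

With the login snippet above, this could be called as looks_logged_in(res.geturl(), res.read(), "the/page/you/want/to/login"). br.geturl() and br.title() report the same kind of information for whatever page the browser is currently on.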

CDspace answered Sep 22 '22 09:09
