
Why can I log in to the Amazon website using Python mechanize, but not requests or urllib2?

I can use the following piece of Python code, which I found online, to log into amazon.com:

import mechanize 

br = mechanize.Browser()  
br.set_handle_robots(False)  
br.addheaders = [("User-agent", "Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101206 Ubuntu/10.10 (maverick) Firefox/3.6.13")]  

sign_in = br.open('https://www.amazon.com/gp/sign-in.html')  

br.select_form(name="sign-in")  
br["email"] = '[email protected]' 
br["password"] = 'test4test'
logged_in = br.submit() 

orders_html = br.open("https://www.amazon.com/gp/css/history/orders/view.html?orderFilter=year-%s&startAtIndex=1000" % 2013)

But the following two snippets, which use the requests module and urllib2, do not work.

import requests
import sys

username = "[email protected]"
password = "test4test"

# Use the variables defined above (the original referenced undefined fb_* names)
login_data = {
        'email' : username,
        'password' : password,
        'flex_password': 'true'}

url = 'https://www.amazon.com/gp/sign-in.html'

# Headers must be a dict (key: value), not a set (key, value)
agent = {'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1'}

session = requests.session(config={'verbose': sys.stderr}, headers = agent)

r = session.get('http://www.amazon.com')

r1 = session.post(url, data=login_data, cookies=r.cookies)

r2 = session.post("https://www.amazon.com/gp/css/history/orders/view.html?orderFilter=year-2013&startAtIndex=1000", cookies = r1.cookies)

#

import urllib2
import urllib
import cookielib

amazon_username = "[email protected]"
amazon_password = "test4test"
url = 'https://www.amazon.com/gp/sign-in.html'

cookie = cookielib.CookieJar()
login_data = urllib.urlencode({'email' : amazon_username, 'password' : amazon_password,})

opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
opener.addheaders = [('User-agent', 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1')]

# urllib2 needs a full URL including the scheme
opener.open('https://www.amazon.com')

response = opener.open(url, login_data)

response = opener.open("https://www.amazon.com/gp/css/history/orders/view.html?orderFilter=year-%s&startAtIndex=1000" % 2013, login_data)

What did I do wrong in posting the Amazon login form? This is the first time I have posted a form. Any help is appreciated.

I prefer to use urllib2 or requests because all my other code uses these two modules.

Moreover, can anybody comment on how mechanize, requests, and urllib2 compare in speed, and on any other advantages of mechanize over the other two?

Update: Following C.C.'s instructions, I can now log in with urllib2. But when I try to do the same with requests, it still does not work. Can anyone give me a clue?

import requests
import sys

fb_username = "[email protected]"
fb_password = "xxxx"

login_data = ({
            'email' : fb_username,
            'password' : fb_password,
            'action': 'sign-in'})

url = 'https://www.amazon.com/gp/sign-in.html'

# Headers must be a dict (key: value), not a set (key, value)
agent = {'User-agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1'}

session = requests.session(config={'verbose': sys.stderr}, headers = agent)

r = session.get(url)

r1 = session.post('https://www.amazon.com/gp/flex/sign-in/select.html', data=login_data, cookies=r.cookies)

b = r1.text

user2687585 asked Oct 22 '22 03:10


1 Answer

Regarding your urllib2 approach, you are missing two things.

First, the source of sign-in.html contains this form tag:

<form name="sign-in" id="sign-in" action="/gp/flex/sign-in/select.html" method="POST">

This means the form should be submitted to /gp/flex/sign-in/select.html, not to sign-in.html itself.

Second, besides the email and password, you also need to send the action field, which says whether you are a new or a returning customer:

<input id="newCust" type="radio" name="action" value="new-user"...>
...
<input id="returningCust" type="radio" name="action" value="sign-in"...>
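Both points can be checked programmatically rather than by reading the page source. The following is a minimal sketch, written for Python 3 and using only the standard library. The `FormInspector` class and the `SAMPLE_HTML` string are hypothetical: the sample is a trimmed-down stand-in built from the fragments quoted above, not the real Amazon page, which may look different today.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FormInspector(HTMLParser):
    """Collect the action, method, and input fields of one named form."""

    def __init__(self, form_name):
        super().__init__()
        self.form_name = form_name
        self.in_form = False
        self.action = None
        self.method = None
        self.inputs = []  # (name, type, value) tuples, in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("name") == self.form_name:
            self.in_form = True
            self.action = attrs.get("action")
            self.method = (attrs.get("method") or "GET").upper()
        elif tag == "input" and self.in_form and attrs.get("name"):
            self.inputs.append(
                (attrs["name"], attrs.get("type"), attrs.get("value"))
            )

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


# Hypothetical sample: only the attributes quoted in the answer are real,
# the email and password inputs are assumed for illustration.
SAMPLE_HTML = """
<form name="sign-in" id="sign-in" action="/gp/flex/sign-in/select.html" method="POST">
  <input type="text" name="email">
  <input id="newCust" type="radio" name="action" value="new-user">
  <input id="returningCust" type="radio" name="action" value="sign-in">
  <input type="password" name="password">
</form>
"""

inspector = FormInspector("sign-in")
inspector.feed(SAMPLE_HTML)

# Resolve the relative action against the URL of the page that holds the form.
post_url = urljoin("https://www.amazon.com/gp/sign-in.html", inspector.action)
print(inspector.method, post_url)
for field in inspector.inputs:
    print(field)
```

In practice you would feed the parser the body returned by the first GET request instead of a hard-coded string.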

It should look something like this:

import cookielib
import urllib
import urllib2

amazon_username = ...
amazon_password = ...

login_data = urllib.urlencode({'action': 'sign-in',
                               'email': amazon_username,
                               'password': amazon_password,
                               })

cookie = cookielib.CookieJar()    
opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cookie))
opener.addheaders = [('User-agent', ...)]

response = opener.open('https://www.amazon.com/gp/sign-in.html')
print(response.getcode())

response = opener.open('https://www.amazon.com/gp/flex/sign-in/select.html', login_data)
print(response.getcode())

response = opener.open("https://www.amazon.com/") # it should show that you are logged in
print(response.getcode())
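
The same flow might be sketched with requests as follows. This is an untested assumption rather than a confirmed fix: it relies on the form fields quoted above and on a current version of requests, where `requests.Session()` takes no arguments and keeps cookies between calls on its own. The helper names `make_session`, `build_login_request`, and `log_in` are made up for illustration.

```python
import requests

SIGN_IN_PAGE = "https://www.amazon.com/gp/sign-in.html"
SIGN_IN_POST = "https://www.amazon.com/gp/flex/sign-in/select.html"


def make_session(user_agent):
    """Return a Session that sends the given User-Agent on every request."""
    session = requests.Session()
    # Headers are a mapping, so pass a dict (key: value), not a set.
    session.headers.update({"User-Agent": user_agent})
    return session


def build_login_request(session, email, password):
    """Prepare (but do not send) the form POST described in the answer."""
    data = {"action": "sign-in", "email": email, "password": password}
    request = requests.Request("POST", SIGN_IN_POST, data=data)
    # prepare_request merges the session's headers and stored cookies.
    return session.prepare_request(request)


def log_in(session, email, password):
    """Fetch the sign-in page for its cookies, then submit the form."""
    session.get(SIGN_IN_PAGE)  # cookies from this response stay in the session
    prepared = build_login_request(session, email, password)
    return session.send(prepared)
```

Calling `log_in(make_session(...), amazon_username, amazon_password)` would perform the two requests. Because the session carries the cookies, there is no need to pass `cookies=r.cookies` by hand on each call.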

Christina answered Oct 24 '22 12:10