How to request pages from website that uses OpenID?

This question has been asked here before. The accepted answer was probably obvious to both the asker and the answerer, but not to me. I commented on that question asking for more detail, but got no response. I also asked on Meta how to revive old questions, and got no answer either.

The answer to that earlier question was:

From the client's perspective, an OpenID login is very similar to any other web-based login. There isn't a defined protocol for the client; it is an ordinary web session that varies based on your OpenID provider. For this reason, I doubt that any such libraries exist. You will probably have to code it yourself.

I already know how to log in to a website with Python, using the urllib2 module. But that's not enough for me to work out how to authenticate with OpenID.

I'm actually trying to get my Stack Overflow inbox in JSON format, for which I need to be logged in.

Could someone provide a short intro or a link to a nice tutorial on how to do that?

asked Sep 01 '11 17:09

neydroydrec

2 Answers

Well I myself don't know much about OpenID but your post (and the bounty!!) got me interested.

This link describes the exact flow of the OpenID authentication sequence (at least for v1.0; the current version is 2.0). From what I could make out, the steps would be something like:

You fetch the Stack Overflow login page, which also offers the option to log in using OpenID (as a form field).
You send your OpenID, which is actually a URI and NOT a username/email (for a Google profile it is your profile ID).
Stack Overflow then contacts your ID provider (in this case Google) and sends you a redirect to the Google login page, along with another link to which you should be redirected afterwards (let's call it a).
You log in on the Google-provided page conventionally (using the POST method from Python).
Google returns a cryptographic token in response to your login request (I'm not entirely sure about this step).
You send a new request to a with this token.
Stack Overflow contacts Google with this token. If authenticity is established, it returns a session ID.
Later requests to Stack Overflow should carry this session ID.
No idea about logging out!!

This link describes the various responses in OpenID and what they mean, so it may come in handy when you code your client.
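As a rough illustration of reading those responses: in OpenID 2.0 the provider's reply comes back as query parameters on the return URL, and openid.mode says what happened. The sketch below is Python 3 (urllib.parse rather than urllib2), and the function name classify_openid_response and the example URLs are made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# Meaning of the openid.mode values a provider can send back (OpenID 2.0).
MODE_MEANINGS = {
    "id_res": "positive assertion: the provider vouches for the identity",
    "cancel": "negative assertion: authentication was not completed",
    "setup_needed": "immediate request failed: user interaction is required",
    "error": "the provider reported an error (see openid.error)",
}


def classify_openid_response(url):
    """Return (mode, meaning) for the openid.mode found in the URL's query."""
    query = parse_qs(urlsplit(url).query)
    # parse_qs maps each key to a list of values; take the first one.
    mode = query.get("openid.mode", [None])[0]
    return mode, MODE_MEANINGS.get(mode, "unknown or missing openid.mode")


if __name__ == "__main__":
    # Hypothetical return URL, shaped like the one a provider redirects to.
    example = ("https://stackoverflow.com/users/authenticate/?s=some_value"
               "&openid.mode=id_res")
    print(classify_openid_response(example))
```

A client that follows the redirects itself could call this on each URL it lands on to decide whether to continue or give up.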

Links from the wiki page OpenID Explained

EDIT: Using the Tamper Data add-on for Firefox, the following sequence of events can be reconstructed.

The user sends a request to the SO login page. On entering the OpenID in the form field, the resulting page sends a 302 redirecting to a Google page. The redirect URL has a lot of OpenID parameters (which are for the Google server). One of them is return_to=https://stackoverflow.com/users/authenticate/?s=some_value.
The user is presented with the Google login page. On login there are a few 302s which redirect the user around within Google's domain.
Finally a 302 is received which redirects the user to the Stack Overflow page specified in return_to earlier.
During this entire series of operations a lot of cookies are generated, which must be stored correctly.
On accessing the SO page (the one Google 302'd to), the SO server processes your request and sends a Set-Cookie response header setting cookies named gauth and usr, along with another 302 to stackoverflow.com. This step completes your login.
Your client simply stores the usr cookie.
You are logged in as long as you remember to send the usr cookie with any request to SO.
You can now request your inbox; just remember to send the usr cookie with the request.
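To inspect a redirect like the first one in that sequence, something along these lines could pull the OpenID parameters out of the Location URL. This is only a sketch in Python 3 (urllib.parse). It assumes the parameter appears on the wire as openid.return_to, since OpenID 2.0 prefixes its parameters with openid., and the provider URL and the helper name openid_params are invented for the example.

```python
from urllib.parse import urlsplit, parse_qs


def openid_params(location):
    """Collect the openid.* query parameters from a redirect URL."""
    query = parse_qs(urlsplit(location).query)
    # parse_qs already percent-decodes values; keep the first value per key.
    return {k: v[0] for k, v in query.items() if k.startswith("openid.")}


if __name__ == "__main__":
    # Hypothetical Location header; a real one carries many more parameters.
    location = ("https://provider.example/openid/endpoint"
                "?openid.mode=checkid_setup"
                "&openid.return_to=https%3A%2F%2Fstackoverflow.com%2Fusers"
                "%2Fauthenticate%2F%3Fs%3Dsome_value")
    print(openid_params(location)["openid.return_to"])
```

Comparing the decoded return URL with the page the final 302 sends you to is a quick way to check that your client followed the chain correctly.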

I suggest you start coding your Python client and study the responses carefully. In most cases it will be a series of 302s with minimal user intervention (except for filling in your Google username and password and approving the site's access).
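One possible starting point for such a client, sketched in Python 3 (urllib.request and http.cookiejar, the successors of urllib2 and cookielib): an opener that keeps every cookie it receives and records each redirect so the chain can be studied afterwards. The names LoggingRedirectHandler and make_client are my own, not part of any library.

```python
import http.cookiejar
import urllib.request


class LoggingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follow redirects as usual, but remember each hop for later study."""

    def __init__(self):
        self.hops = []  # list of (status_code, target_url)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the hop, then let the standard handler build the next request.
        self.hops.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def make_client():
    """Build an opener that stores cookies and records redirects."""
    jar = http.cookiejar.CookieJar()
    redirects = LoggingRedirectHandler()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar), redirects)
    return opener, jar, redirects
```

After a call such as opener.open(login_url), redirects.hops lists the 302s that were followed and iterating over jar shows the cookies collected along the way.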

However, to make it easier, you could just log in to SO using your browser, copy all the cookie values and make a request using urllib2 with those cookie values set.

Of course, if you log out in the browser, you will have to log in again and update the cookie value in your Python program.
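A minimal sketch of that shortcut, written for Python 3 (urllib.request in place of urllib2). The helper name request_with_cookies is mine, the cookie value is a placeholder to be replaced with what your browser shows, and the inbox URL is the one used in the other answer on this page, which may no longer be valid.

```python
import urllib.request


def request_with_cookies(url, cookies):
    """Build (but do not send) a request carrying the given cookies."""
    # A Cookie header is a "; "-separated list of name=value pairs.
    header = "; ".join("{}={}".format(k, v) for k, v in cookies.items())
    return urllib.request.Request(url, headers={"Cookie": header})


if __name__ == "__main__":
    # Placeholder value: paste the real one from your browser's cookie store.
    req = request_with_cookies("http://stackoverflow.com/inbox/genuwine",
                               {"usr": "PASTE_VALUE_HERE"})
    print(req.get_header("Cookie"))
    # To actually send it: urllib.request.urlopen(req).read()
```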

answered Oct 17 '22 14:10

RedBaron

I know this is close to archaeology, digging up a post that's two years old, but I just wrote a new, enhanced version of the code from the accepted answer, so I thought it would be good to share it here, as this question and its answers were a great help to me in implementing it.

So, here's what's different:

it uses the requests library, which is an improvement over urllib2;
it supports authenticating with Google's and Stack Exchange's OpenID providers;
it is much shorter and simpler to read, though it has fewer printouts.

Here's the code:

#!/usr/bin/env python

import sys
import urllib
from getpass import getpass  # used below to read the password without echo

import requests
from BeautifulSoup import BeautifulSoup

def get_google_auth_session(username, password):
    session = requests.Session()
    google_accounts_url = 'http://accounts.google.com'
    authentication_url = 'https://accounts.google.com/ServiceLoginAuth'
    stack_overflow_url = 'http://stackoverflow.com/users/authenticate'

    r = session.get(google_accounts_url)
    dsh = BeautifulSoup(r.text).findAll(attrs={'name' : 'dsh'})[0].get('value').encode()
    auto = r.headers['X-Auto-Login']
    follow_up = urllib.unquote(urllib.unquote(auto)).split('continue=')[-1]
    galx = r.cookies['GALX']

    payload = {'continue' : follow_up,
               'followup' : follow_up,
               'dsh' : dsh,
               'GALX' : galx,
               'pstMsg' : 1,
               'dnConn' : 'https://accounts.youtube.com',
               'checkConnection' : '',
               'checkedDomains' : '',
               'timeStmp' : '',
               'secTok' : '',
               'Email' : username,
               'Passwd' : password,
               'signIn' : 'Sign in',
               'PersistentCookie' : 'yes',
               'rmShown' : 1}

    r = session.post(authentication_url, data=payload)

    if r.url != authentication_url:  # being redirected away from the login URL is taken as success
        print "Logged in"
    else:
        print "login failed"
        sys.exit(1)

    payload = {'oauth_version' : '',
               'oauth_server' : '',
               'openid_username' : '',
               'openid_identifier' : ''}
    r = session.post(stack_overflow_url, data=payload)
    return session

def get_so_auth_session(email, password):
    session = requests.Session()
    r = session.get('http://stackoverflow.com/users/login')
    fkey = BeautifulSoup(r.text).findAll(attrs={'name' : 'fkey'})[0]['value']

    payload = {'openid_identifier': 'https://openid.stackexchange.com',
               'openid_username': '',
               'oauth_version': '',
               'oauth_server': '',
               'fkey': fkey,
               }
    r = session.post('http://stackoverflow.com/users/authenticate', allow_redirects=True, data=payload)
    fkey = BeautifulSoup(r.text).findAll(attrs={'name' : 'fkey'})[0]['value']
    session_name = BeautifulSoup(r.text).findAll(attrs={'name' : 'session'})[0]['value']

    payload = {'email': email,
               'password': password,
               'fkey': fkey,
               'session': session_name}

    r = session.post('https://openid.stackexchange.com/account/login/submit', data=payload)
    # look for an element with class "error" in the response to detect a failed login
    error = BeautifulSoup(r.text).findAll(attrs={'class' : 'error'})
    if len(error) != 0:
        print "ERROR:", error[0].text
        sys.exit(1)
    return session

if __name__ == "__main__":
    prov = raw_input('Choose your openid provider [1 for StackOverflow, 2 for Google]: ')
    name = raw_input('Enter your OpenID address: ')
    pswd = getpass('Enter your password: ')
    if '1' in prov:
        so = get_so_auth_session(name, pswd)
    elif '2' in prov:
        so = get_google_auth_session(name, pswd)
    else:
        print "Error no openid provider given"
        sys.exit(1)  # without a session there is nothing to request

    r = so.get('http://stackoverflow.com/inbox/genuwine')
    print r.json()

The code is also available as a GitHub gist.

HTH

answered Oct 17 '22 14:10

zmo

Tags: python, authentication, openid, urllib2
