Browserless access to LinkedIn with Python

I'm writing a command-line application that accesses LinkedIn. I'm using the python-linkedin library.

Things work as I expected, but I have a really big gripe about the authentication process. Currently, I need to:

  1. Start my application and wait for it to print an authentication URL
  2. Go to that URL with my browser
  3. Give my blessing for the application and wait for it to redirect me to a URL
  4. Extract the access token from the URL
  5. Input that access token into my application
  6. Do what I need to do with LinkedIn

I don't like doing steps 2 to 5 manually so I would like to automate them. What I was thinking of doing was:

  • Use a headless client like mechanize to access the URL from step 1 above
  • Scrape the screen and give my blessing automatically (may be required to input username and password -- I know these, so it's OK)
  • Wait to be redirected and grab the redirection URL
  • Extract the token from the URL
  • PROFIT!
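
The "extract the token from the URL" step in the plan above needs nothing beyond the standard library. Here is a minimal sketch in Python 3, assuming an OAuth 1.0a style callback in which the verifier arrives as an `oauth_verifier` query parameter. The function name `extract_verifier` and the example URL are made up for illustration and are not part of python-linkedin.

```python
from urllib.parse import urlparse, parse_qs


def extract_verifier(redirect_url, param="oauth_verifier"):
    """Return the first value of `param` in the URL's query string, or None."""
    query = parse_qs(urlparse(redirect_url).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical redirect URL, for illustration only.
example = "http://localhost:8080/callback?oauth_token=abc&oauth_verifier=12345"
print(extract_verifier(example))  # prints 12345
```

The returned string is what would be typed at the "Enter verifier" prompt, so the same value could be passed to the application directly instead.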

Question time:

  • Looking around, this guy right here on SO tried to do something similar but was told that it's impossible. Why?
  • Then, this guy here does it in Jython and HtmlUnit. Should be possible with straight Python and mechanize, right?
  • Finally, has anybody seen a solution with straight Python and mechanize (or any other headless browser alternative)? I don't want to reinvent the wheel, but will code it up if necessary.

EDIT:

Code to initialize tokens (using the approach of the accepted answer):

# Snippet from inside a function; the imports belong at the top of the module.
import pickle
import sys

api = linkedin.LinkedIn(KEY, SECRET, RETURN_URL)
result = api.request_token()
if not result:
    print 'Initialization error:', api.get_error()
    return

print 'Go to URL:', api.get_authorize_url()
print 'Enter verifier: ',
verifier = sys.stdin.readline().strip()
if not verifier:
    # Nothing was typed in, so there is nothing to exchange for an access token.
    print 'Initialization error: no verifier entered'
    return

result = api.access_token(verifier=verifier)
if not result:
    print 'Initialization error:', api.get_error()
    return

# Pickle data is binary: open the file in binary mode and let `with` close it.
with open('tokens.pickle', 'wb') as fout:
    for t in (api._request_token, api._request_token_secret,
              api._access_token, api._access_token_secret):
        pickle.dump(t, fout)

print 'Initialization complete.'

Code to use tokens:

import pickle

api = linkedin.LinkedIn(KEY, SECRET, RETURN_URL)

tokens = tokens_fname()
try:
    # Read the four values back in the same order they were pickled.
    with open(tokens, 'rb') as fin:
        api._request_token = pickle.load(fin)
        api._request_token_secret = pickle.load(fin)
        api._access_token = pickle.load(fin)
        api._access_token_secret = pickle.load(fin)
except IOError as ioe:
    print ioe
    print 'Please run `python init_tokens.py` first'
    return

profiles = api.get_search({ 'name' : name })

mpenkov asked Nov 21 '11 14:11


2 Answers

As you are planning on authorizing yourself just once, and then making calls to the API for your own information, I would just manually retrieve your access token rather than worrying about automating it.

The user access token generated by LinkedIn when you authorize a given application is permanent unless you specify otherwise on the authorization screen. All you need to do is generate the authorization screen with your application, go through the process, and on success print out and store your user access token (both the token and its secret). Once you have those, you can store them in a file, database, etc., and use them whenever you make calls to the API.
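
Storing the token pair once and reading it back on later runs could look roughly like the following. This is a sketch in Python 3 that swaps in JSON as the file format; the helper names `save_tokens` and `load_tokens` and the key names are hypothetical, and nothing here comes from the LinkedIn API itself.

```python
import json
import os


def save_tokens(path, token, secret):
    """Write the token pair to `path` as JSON."""
    # Mode 0o600 (owner read/write only) is applied when the file is newly created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump({"access_token": token, "access_token_secret": secret}, f)


def load_tokens(path):
    """Return (token, secret) as previously written by save_tokens."""
    with open(path) as f:
        data = json.load(f)
    return data["access_token"], data["access_token_secret"]
```

After the one-time manual authorization, `save_tokens("tokens.json", token, secret)` would run once, and every later run would start with `load_tokens("tokens.json")`. JSON keeps the file human-readable, which makes it easy to check what was stored.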

It's in PHP, but this demo does basically this. Just modify the demo.php script to echo out your token as needed.

Unpossible answered Oct 29 '22 19:10


I have not tried it myself, but I believe that in theory it should be possible using Selenium WebDriver together with PyVirtualDisplay. This idea is described here.
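
A rough, untested sketch of that idea in Python 3 follows. It assumes the third-party `selenium` and `pyvirtualdisplay` packages, plus Firefox and Xvfb, are installed. The helper names `start_headless_firefox` and `wait_for_redirect` are hypothetical. Filling in the login form is left out on purpose, since the field names on the authorization page are not known here.

```python
import time


def start_headless_firefox():
    """Start a virtual X display and a Firefox WebDriver inside it."""
    # Third-party imports are kept local so the polling helper below
    # can be used (and tested) without them.
    from pyvirtualdisplay import Display
    from selenium import webdriver

    display = Display(visible=0, size=(1024, 768))
    display.start()
    try:
        driver = webdriver.Firefox()
    except Exception:
        display.stop()  # do not leave the virtual display running
        raise
    return display, driver


def wait_for_redirect(driver, return_url, timeout=30.0, poll=0.5):
    """Poll driver.current_url until it starts with return_url, then return it."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        url = driver.current_url
        if url.startswith(return_url):
            return url
        time.sleep(poll)
    raise RuntimeError("no redirect to %s within %s seconds" % (return_url, timeout))
```

Usage would be along these lines: call `start_headless_firefox()`, then `driver.get(...)` with the URL from `api.get_authorize_url()`, submit the login form with whatever locators the page actually uses, call `wait_for_redirect(driver, RETURN_URL)`, and finally `driver.quit()` and `display.stop()`.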

unutbu answered Oct 29 '22 18:10