
How to use a proxy with RoboBrowser

Tags: python, django

I'm working with RoboBrowser (http://robobrowser.readthedocs.org/en/latest/readme.html), a new Python library based on the Beautiful Soup and Requests libraries, within Django. My Django app contains:

from django.http import HttpResponse
from robobrowser import RoboBrowser

def index(request):
    # URL posted by the client, e.g. p = 'https://www.yahoo.com/'
    p = str(request.POST.get('p', False))

    pr = "http://10.10.1.10:3128/"
    setProxy(pr)

    browser = RoboBrowser(history=True)
    postedmessage = browser.open(p)
    return HttpResponse(postedmessage)

I would like to add a proxy to my code but can't find a reference in the docs on how to do this. Is it possible to do this?

EDIT:

Following your recommendation, I've changed the code to:

    pr="http://10.10.1.10:3128/"
    setProxy(pr)
    browser = RoboBrowser(history=True)

with:

def setProxy(pr):
    import os
    os.environ['HTTP_PROXY'] = pr
    return

I'm now getting:

Django Version: 1.6.4
Exception Type: LocationParseError
Exception Value:    
Failed to parse: Failed to parse: 10.10.1.10:3128

Any ideas on what to do next? I can't find any reference to this error.

user1592380 asked Dec 15 '22 22:12



2 Answers

After some recent API cleanup in RoboBrowser, there are now two relatively straightforward ways to control proxies. First, you can configure proxies in your requests session, and then pass that session to your browser. This will apply your proxies to all requests made through the browser.

from requests import Session
from robobrowser import RoboBrowser

session = Session()
session.proxies = {'http': 'http://my.proxy.com/'}
browser = RoboBrowser(session=session)

Second, you can set proxies on a per-request basis. The open, follow_link, and submit_form methods of RoboBrowser now accept keyword arguments for requests.Session.send. For example:

browser.open('http://stackoverflow.com/', proxies={'http': 'http://your.proxy.com'})
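
Note that the proxies dict is keyed by the scheme of the URL being fetched, so a mapping with only 'http' will not apply to an https:// page such as the one in the question. The following is a minimal sketch of the session approach that covers both schemes. The names build_proxies and make_browser are hypothetical (they are not part of RoboBrowser or Requests), and the sketch assumes one proxy serves both HTTP and HTTPS traffic.

```python
from urllib.parse import urlsplit


def build_proxies(proxy_url):
    """Return a Requests-style proxies dict using one proxy for both schemes.

    Hypothetical helper; assumes the same proxy handles http and https traffic.
    """
    parts = urlsplit(proxy_url)
    # Reject values such as "10.10.1.10:3128" that lack "scheme://host".
    if not parts.scheme or not parts.hostname:
        raise ValueError(
            "proxy URL must look like 'http://host:port', got %r" % proxy_url
        )
    return {'http': proxy_url, 'https': proxy_url}


def make_browser(proxy_url):
    """Create a RoboBrowser whose session sends every request via proxy_url."""
    # Imported lazily so build_proxies works without these packages installed.
    from requests import Session
    from robobrowser import RoboBrowser

    session = Session()
    session.proxies = build_proxies(proxy_url)
    return RoboBrowser(session=session, history=True)
```

In the view from the question, this would be used as browser = make_browser("http://10.10.1.10:3128/") followed by browser.open(p).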
jm.carp answered Dec 26 '22 12:12



Since RoboBrowser uses the Requests library, you can try setting the proxies as described in the Requests docs, via the environment variables HTTP_PROXY and HTTPS_PROXY.
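
As a rough sketch of that approach, the helper below sets both variables from Python before the browser is created. The name set_proxy_env is hypothetical, the proxy address is the one from the question, and the sketch assumes the same proxy should handle both schemes. It is not offered as a confirmed fix for the LocationParseError above; it only shows the variables set with a full scheme://host:port value.

```python
import os


def set_proxy_env(proxy_url):
    """Point both proxy environment variables at proxy_url (hypothetical helper)."""
    if '://' not in proxy_url:
        raise ValueError("include the scheme, e.g. 'http://10.10.1.10:3128'")
    # Requests chooses the proxy by the scheme of the URL being fetched, so an
    # https:// page is only proxied when HTTPS_PROXY is set as well.
    os.environ['HTTP_PROXY'] = proxy_url
    os.environ['HTTPS_PROXY'] = proxy_url


# Proxy address taken from the question.
set_proxy_env('http://10.10.1.10:3128')
```

Environment variables apply to the whole process, so every Requests session in the Django app that keeps the default trust_env=True will pick them up. For a proxy that should affect only one browser, setting session.proxies is the narrower option.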

arocks answered Dec 26 '22 12:12
