
Python urllib2 automatic form filling and retrieval of results

I'm looking to be able to query a site for warranty information on the machine this script is running on. It should be able to fill out a form if needed (as in the case of, say, HP's service site) and then retrieve the resulting web page.

I already have the pieces in place to parse the HTML that comes back; I'm just having trouble with how to POST the data that needs to go into the form fields and then retrieve the resulting page.

asked Apr 14 '11 by tak

2 Answers

If you absolutely need to use urllib2, the basic gist is this:

import urllib
import urllib2

url = 'http://whatever.foo/form.html'
form_data = {'field1': 'value1', 'field2': 'value2'}

# urlencode() turns the dict into 'field1=value1&field2=value2'
params = urllib.urlencode(form_data)

# passing data as the second argument makes this a POST request
response = urllib2.urlopen(url, params)
data = response.read()

If you send along POST data (the 2nd argument to urlopen()), the request method is automatically set to POST.
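Some vendor sites also refuse requests that don't carry browser-like headers. If you run into that, you can wrap the same POST in a urllib2.Request and set the headers yourself. This is only a sketch; the URL and field name are made up:

import urllib
import urllib2

url = 'http://whatever.foo/form.html'            # hypothetical form URL
params = urllib.urlencode({'serial': 'ABC123'})  # hypothetical field name

# Request() lets you attach headers; supplying data still makes it a POST
req = urllib2.Request(url, params, {'User-Agent': 'Mozilla/5.0'})
response = urllib2.urlopen(req)
html = response.read()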

I suggest you do yourself a favor and use mechanize, a full-blown urllib2 replacement that acts like a real browser. A lot of sites use hidden fields, cookies, and redirects, none of which urllib2 handles for you by default, whereas mechanize does.

Check out Emulating a browser in Python with mechanize for a good example.
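To give a feel for it, here's a minimal mechanize sketch; the URL, the form index, and the 'serial' field name are placeholders for whatever the real warranty page uses:

import mechanize

br = mechanize.Browser()
br.set_handle_robots(False)               # some vendor sites disallow robots.txt-respecting clients
br.open('http://whatever.foo/warranty')   # hypothetical URL

# select_form(nr=0) picks the first form on the page; adjust for the real site
br.select_form(nr=0)
br['serial'] = 'ABC123'                   # hypothetical field name

response = br.submit()                    # follows redirects and keeps cookies for you
html = response.read()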

answered Sep 21 '22 by jathanism

Using urllib and urllib2 together:

data = urllib.urlencode([('field1', val1), ('field2', val2)])  # list of two-element tuples
response = urllib2.urlopen('post-url', data)  # the second argument makes this a POST
content = response.read()

content will give you the page source.
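Keep in mind that urlopen() raises an exception instead of returning a page when the lookup fails; a small sketch of catching that, again with a made-up URL and field name:

import urllib
import urllib2

data = urllib.urlencode([('serial', 'ABC123')])  # hypothetical field
try:
    response = urllib2.urlopen('http://whatever.foo/lookup', data)
    content = response.read()
except urllib2.HTTPError as e:
    # the server answered, but with an error status (404, 500, ...)
    print 'HTTP error:', e.code
except urllib2.URLError as e:
    # could not reach the server at all
    print 'connection failed:', e.reason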

answered Sep 25 '22 by gladysbixly