
Fix character encoding of a webpage using Python Mechanize

I am trying to submit a form on this page using Mechanize.

import mechanize

br = mechanize.Browser()
br.open("http://mspc.bii.a-star.edu.sg/tankp/run_depth.html")
# select the first form on the page
br.select_form(nr=0)
# fill in the PDB ID field
br['pdb_id'] = '1atp'
req = br.submit()

This, however, gives the following error:

mechanize._form.ParseError: expected name token at '<! INPUT PDB FILE>\n\t'

I figure this is caused by a character encoding problem on the page. How can I fix this?

fzn asked May 29 '15 07:05


1 Answer

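The problem is not character encoding but some broken HTML comment tags (such as the <! INPUT PDB FILE> shown in the error), which make the page invalid HTML that mechanize's default parser can't read. You can use the included BeautifulSoup-based parser instead by passing RobustFactory to the browser, which works in my case (Python 2.7.9, mechanize 0.2.5):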

#!/usr/bin/env python
#-*- coding: utf-8 -*-
import mechanize

br = mechanize.Browser(factory=mechanize.RobustFactory())
br.open('http://mspc.bii.a-star.edu.sg/tankp/run_depth.html')
br.select_form(nr=0)
br['pdb_id'] = '1atp'
response = br.submit()
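
If RobustFactory is not available or still chokes on the page, another option is to strip the malformed declarations yourself before mechanize parses the forms. The following is only a rough sketch: strip_broken_declarations and open_cleaned are hypothetical helper names, the regular expression is an assumption about what counts as "broken" (any <! ...> that is neither a <!-- comment nor a DOCTYPE; it would also mangle CDATA sections), and open_cleaned relies on mechanize's get_data/set_data/set_response calls, which I have not run against this particular site.

```python
import re

# Assumption: anything starting with "<!" that is neither a real comment
# ("<!--") nor a DOCTYPE is treated as a broken declaration,
# e.g. "<! INPUT PDB FILE>".
_PATTERN = r'<!(?!--)(?!DOCTYPE)[^>]*>'
_TEXT_RE = re.compile(_PATTERN, re.IGNORECASE)
_BYTES_RE = re.compile(_PATTERN.encode('ascii'), re.IGNORECASE)


def strip_broken_declarations(html):
    """Remove malformed '<! ...>' declarations from text or bytes."""
    if isinstance(html, bytes):
        return _BYTES_RE.sub(b'', html)
    return _TEXT_RE.sub(u'', html)


def open_cleaned(br, url):
    """Open url with a mechanize Browser and re-feed the cleaned HTML.

    br is expected to be a mechanize.Browser instance.
    """
    response = br.open(url)
    response.set_data(strip_broken_declarations(response.get_data()))
    br.set_response(response)  # forms are parsed from the cleaned data
    return response
```

With this, open_cleaned(br, url) would take the place of the br.open(...) call, followed by br.select_form(nr=0) as before.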
Squall answered Nov 07 '22 12:11