
POST method form in lxml raises TypeError with submit_form

I'm trying to submit a form that uses the POST method with lxml, and I get a TypeError. This minimal example reproduces the error:

>>> import lxml.html
>>> page = lxml.html.parse("http://www.webcom.com/html/tutor/forms/start.shtml")
>>> form = page.getroot().forms[0]
>>> form.fields['your_name'] = 'Morphit'
>>> result = lxml.html.parse(lxml.html.submit_form(form))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python3.3/site-packages/lxml/html/__init__.py", line 887, in submit_form
    return open_http(form.method, url, values)
  File "/usr/lib/python3.3/site-packages/lxml/html/__init__.py", line 907, in open_http_urllib
    return urlopen(url, data)
  File "/usr/lib/python3.3/urllib/request.py", line 160, in urlopen
    return opener.open(url, data, timeout)
  File "/usr/lib/python3.3/urllib/request.py", line 471, in open
    req = meth(req)
  File "/usr/lib/python3.3/urllib/request.py", line 1183, in do_request_
    raise TypeError(msg)
TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

I've found the exact error elsewhere online, but I haven't seen it raised from inside lxml like this. Is this a bug or expected behaviour, and how can I work around it?

Morphit asked Oct 31 '12 04:10



1 Answer

From https://github.com/lxml/lxml/pull/122/files:

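"In python3, urlopen expects a byte stream for the POST data. this patch encodes the data in utf-8 before transmission."

So this is a bug in lxml under Python 3: urlencode() returns a str, but urlopen() only accepts bytes as POST data. To fix it, change line 918 of src/lxml/html/__init__.py from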

data = urlencode(values)

to

data = urlencode(values).encode('utf-8')
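
If editing the installed library is not an option, the same fix can probably be applied from your own code. submit_form() accepts an open_http argument, a callable that receives (method, url, values), so you can hand it an opener that encodes the POST body itself. The following is only a sketch of that idea, not lxml's own implementation; the names encode_form_values and open_http_bytes are made up for this example.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def encode_form_values(values):
    """URL-encode (name, value) pairs and return bytes.

    In Python 3, urlencode() returns str, while urlopen() requires
    bytes for POST data, hence the explicit encode step.
    """
    return urlencode(values).encode("utf-8")


def open_http_bytes(method, url, values):
    """Hypothetical drop-in opener for lxml.html.submit_form()."""
    if not url:
        raise ValueError("cannot submit, no URL provided")
    if method.upper() == "GET":
        # GET forms carry their values in the query string, not the body.
        separator = "&" if "?" in url else "?"
        return urlopen(url + separator + urlencode(values))
    # POST: the body must be bytes.
    return urlopen(url, encode_form_values(values))


# Usage with the form from the question (needs lxml and network access):
#   response = lxml.html.submit_form(form, open_http=open_http_bytes)
#   result = lxml.html.parse(response)
```

This leaves the installed lxml untouched, at the cost of duplicating a little of its request logic.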
Mark Pundurs answered Sep 28 '22 12:09
