
Create and parse multipart HTTP requests in Python

I'm trying to write some Python code that can create multipart MIME HTTP requests on the client and then interpret them appropriately on the server. I think I have partially succeeded on the client side with this:

from email.mime.multipart import MIMEMultipart, MIMEBase
import httplib
h1 = httplib.HTTPConnection('localhost:8080')
msg = MIMEMultipart()
fp = open('myfile.zip', 'rb')
base = MIMEBase("application", "octet-stream")
base.set_payload(fp.read())
msg.attach(base)
h1.request("POST", "http://localhost:8080/server", msg.as_string())

The only problem with this is that the email library also writes its own Content-Type and MIME-Version headers into the message, and I'm not sure how these relate to the HTTP headers that httplib sends:

Content-Type: multipart/mixed; boundary="===============2050792481=="
MIME-Version: 1.0

--===============2050792481==
Content-Type: application/octet-stream
MIME-Version: 1.0

This may be the reason that when this request is received by my web.py application, I just get an error message. The web.py POST handler:

class MultipartServer:
    def POST(self, collection):
        print web.input()

Throws this error:

Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 242, in process
    return self.handle()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 233, in handle
    return self._delegate(fn, self.fvars, args)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 415, in _delegate
    return handle_class(cls)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/application.py", line 390, in handle_class
    return tocall(*args)
  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/webapi.py", line 279, in input
    return storify(out, *requireds, **defaults)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 150, in storify
    value = getvalue(value)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 139, in getvalue
    return unicodify(x)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 130, in unicodify
    if _unicode and isinstance(s, str): return safeunicode(s)
  File "/usr/local/lib/python2.6/dist-packages/web.py-0.34-py2.6.egg/web/utils.py", line 326, in safeunicode
    return obj.decode(encoding)
  File "/usr/lib/python2.6/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 137-138: invalid data

My own code appears about halfway down the traceback:

  File "/home/richard/Development/server/webservice.py", line 31, in POST
    print web.input()

It's coming along, but I'm not sure where to go from here. Is this a problem with my client code, or a limitation of web.py (perhaps it just can't support multipart requests)? Any hints or suggestions of alternative code libraries would be gratefully received.

EDIT

The error above was caused by the data not being automatically base64 encoded. Adding

from email import encoders
encoders.encode_base64(base)

gets rid of this error, and now the problem is clear. The HTTP request isn't being interpreted correctly on the server, presumably because the email library is putting what should be the HTTP headers into the body instead:

<Storage {'Content-Type: multipart/mixed': u'', 
          ' boundary': u'"===============1342637378=="\n'
          'MIME-Version: 1.0\n\n--===============1342637378==\n'
          'Content-Type: application/octet-stream\n'
          'MIME-Version: 1.0\n' 
          'Content-Transfer-Encoding: base64\n'
          '\n0fINCs PBk1jAAAAAAAAA.... etc

So something is not right there.
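The base64 step described in the edit above can be sketched as follows. This is only an illustration, written for Python 3 rather than the Python 2 used in the question, and the `binary` value is made-up stand-in data, not a real zip file. It shows that raw bytes may not decode as UTF-8, while the base64-encoded payload is plain ASCII and can be decoded back to the original bytes.

```python
from email import encoders
from email.mime.base import MIMEBase

# Stand-in for the zip contents (an assumption for illustration only).
binary = bytes(range(256))

# Raw binary data is generally not valid UTF-8, which is what the
# traceback in the question complains about.
try:
    binary.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

part = MIMEBase("application", "octet-stream")
part.set_payload(binary)

# Replaces the payload with base64 text and sets
# Content-Transfer-Encoding: base64 on the part.
encoders.encode_base64(part)

encoded = part.get_payload()                # ASCII-only text
round_trip = part.get_payload(decode=True)  # original bytes again
```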

Thanks

Richard

Richard J asked Dec 13 '10 22:12


2 Answers

I used this package by Will Holcomb, http://pypi.python.org/pypi/MultipartPostHandler/0.1.0, to make multipart requests with urllib2; it may help you out.

Justin Fay answered Oct 16 '22 19:10


After a bit of exploration, the answer to this question has become clear. The short answer is that although the Content-Disposition header is optional in a MIME-encoded message, web.py requires it on each MIME part in order to parse the HTTP request correctly.
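To illustrate why a server might depend on that header, here is a small sketch using Python 3's standard email parser. It is not web.py's parsing code, and the `raw` message and the `describe_parts` helper are hypothetical names invented for the example. The point is only that the field name and filename live in Content-Disposition, so a part without the header gives the receiver nothing to key the data on.

```python
from email import message_from_bytes

# A hand-written two-part message (made up for illustration): the first
# part carries Content-Disposition, the second does not.
raw = (
    b'Content-Type: multipart/related; boundary="demo"\n'
    b"\n"
    b"--demo\n"
    b"Content-Type: application/zip\n"
    b'Content-Disposition: file; name="package"; filename="file.zip"\n'
    b"\n"
    b"payload-one\n"
    b"--demo\n"
    b"Content-Type: application/zip\n"
    b"\n"
    b"payload-two\n"
    b"--demo--\n"
)


def describe_parts(message):
    """Return (field name, filename) per part; None where not present."""
    return [
        (
            part.get_param("name", header="content-disposition"),
            part.get_filename(),
        )
        for part in message.get_payload()
    ]


parts = describe_parts(message_from_bytes(raw))
```

Here the first part yields a name and filename, and the second yields `None` for both.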

Contrary to other comments on this question, the difference between HTTP and email is irrelevant, as they are simply transport mechanisms for the MIME message and nothing more. Multipart/related (not multipart/form-data) messages are common in content-exchanging web services, which is the use case here. The code snippets provided are accurate, though, and led me to a slightly briefer solution to the problem.

import httplib
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart

# open an HTTP connection
h1 = httplib.HTTPConnection('localhost:8080')

# create a MIME multipart message of type multipart/related
msg = MIMEMultipart("related")

# create a MIME part containing a zip file, with a Content-Disposition header
# on the section
base = MIMEBase("application", "zip")
base['Content-Disposition'] = 'file; name="package"; filename="file.zip"'
with open('file.zip', 'rb') as fp:
    base.set_payload(fp.read())
encoders.encode_base64(base)
msg.attach(base)

# Here's a rubbish bit: chomp through the header rows until hitting a newline on
# its own, reading each row on the way as an HTTP header, then read the rest
# of the message into a new variable
header_mode = True
headers = {}
body = []
for line in msg.as_string().splitlines(True):
    if line == "\n" and header_mode:
        header_mode = False
    if header_mode:
        (key, value) = line.split(":", 1)
        headers[key.strip()] = value.strip()
    else:
        body.append(line)
body = "".join(body)

# do the request, with the separated headers and body
h1.request("POST", "http://localhost:8080/server", body, headers)

This is picked up perfectly well by web.py, so it's clear that email.mime.multipart is suitable for creating MIME messages to be transported by HTTP, with the exception of its header handling.
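For comparison, the same header/body separation can be sketched in Python 3 without walking the message line by line. This is a hedged alternative, not the code used above: `split_for_http` is a hypothetical helper, the payload bytes are a stand-in for a real zip file, and the boundary is set explicitly so that it is already present in the top-level headers before the message is serialised.

```python
import uuid
from email import encoders
from email.mime.base import MIMEBase
from email.mime.multipart import MIMEMultipart


def split_for_http(msg):
    """Return (headers, body) so the top-level MIME headers can be sent
    as HTTP headers and only the multipart body goes in the request."""
    raw = msg.as_bytes()
    # The first blank line separates the top-level headers from the body.
    _, _, body = raw.partition(b"\n\n")
    headers = dict(msg.items())
    return headers, body


# An explicit boundary (an assumption of this sketch) is set up front.
boundary = uuid.uuid4().hex
msg = MIMEMultipart("related", boundary=boundary)

part = MIMEBase("application", "zip")
part["Content-Disposition"] = 'file; name="package"; filename="file.zip"'
part.set_payload(b"PK\x03\x04 stand-in bytes")  # not a real zip file
encoders.encode_base64(part)
msg.attach(part)

headers, body = split_for_http(msg)
```

In Python 3, where httplib became http.client, the result would presumably be sent with something like `http.client.HTTPConnection("localhost", 8080).request("POST", "/server", body, headers)`; whether a given server accepts it still depends on the server.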

My other overall concern is scalability. Neither this solution nor the others proposed here scale well, as they read the entire contents of a file into a variable before bundling it up in the MIME message. A better solution would be one that could serialise on demand as the content is piped out over the HTTP connection. It's not urgent for me to fix that, but I'll come back here with a solution if I get to it.
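One possible shape for such an on-demand approach is a generator that yields the multipart body in pieces. The sketch below is an assumption-laden illustration in Python 3, not a tested solution to the problem above: `iter_multipart` is a hypothetical helper, it writes the part headers by hand instead of using the email library, it sends the file as binary rather than base64, and it has not been checked against web.py.

```python
import io


def iter_multipart(fileobj, boundary, name, filename, chunk_size=64 * 1024):
    """Yield a single-part multipart body piece by piece (hypothetical)."""
    # Part headers, including the Content-Disposition the server needed.
    yield (
        "--{b}\r\n"
        'Content-Disposition: file; name="{n}"; filename="{f}"\r\n'
        "Content-Type: application/zip\r\n"
        "Content-Transfer-Encoding: binary\r\n"
        "\r\n"
    ).format(b=boundary, n=name, f=filename).encode("ascii")
    # Read the file in fixed-size chunks instead of all at once.
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        yield chunk
    # The closing delimiter ends the multipart body.
    yield "\r\n--{b}--\r\n".format(b=boundary).encode("ascii")


# In-memory stand-in for a large file (made-up data for the demo).
data = b"\x00\x01" * 100000
pieces = list(iter_multipart(io.BytesIO(data), "demo-boundary", "package", "file.zip"))
```

Only one chunk is held in memory at a time when the generator is consumed lazily. Two caveats apply: the boundary must not occur inside the binary data, and the request needs either a known Content-Length or chunked transfer encoding, which the server also has to support.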

Richard J answered Oct 16 '22 19:10