
Handling HTTP chunked encoding with Django

I have a problem handling HTTP chunked transfer encoding.

I'm using:

  • Apache
  • mod_wsgi
  • Django

Django can only handle regular HTTP requests that carry a Content-Length header field. When a request uses a Transfer-Encoding (TE) such as chunked or gzip, it returns an empty result.

I'm thinking of two approaches:

  1. Modifying the django.wsgi Python file.
  2. Adding a middleware Python file to Django that intercepts any chunked HTTP request, converts it to a regular HTTP request with a Content-Length header field, and then passes it to Django, which can handle it normally.

Can anybody help with either of these two options? (Other options are most welcome, of course.)

Thanks!


This is an extension to my question after Graham's first answer:

First of all, thanks for your quick response. The client being used is Axis, which is part of another company's system that communicates with ours. I had WSGIChunkedRequest On set, and I also made some modifications to my WSGI wrapper, like this:

def application(environ, start_response):

    if environ.get("mod_wsgi.input_chunked") == "1":
        stream = environ["wsgi.input"]
        print stream
        print 'type: ', type(stream)
        length = 0
        for byte in stream:
            length+=1
        #print length    
        environ["CONTENT_LENGTH"] = len(stream.read(length))

    django_application = get_wsgi_application()
    return django_application(environ, start_response)

but it gives me these errors (extracted from Apache's error.log file):

[Sat Aug 25 17:26:07 2012] [error] <mod_wsgi.Input object at 0xb6c35390>
[Sat Aug 25 17:26:07 2012] [error] type:  <type 'mod_wsgi.Input'>
[Sat Aug 25 17:26:08 2012] [error] [client xxxxxxxxxxxxx] mod_wsgi (pid=27210): Exception occurred processing WSGI script '/..../wsgi.py'.
[Sat Aug 25 17:26:08 2012] [error] [client xxxxxxxxxxxxx] Traceback (most recent call last):
[Sat Aug 25 17:26:08 2012] [error] [client xxxxxxxxxxxxx]   File "/..../wsgi.py", line 57, in application
[Sat Aug 25 17:26:08 2012] [error] [client xxxxxxxxxxxxx]     for byte in stream:
[Sat Aug 25 17:26:08 2012] [error] [client xxxxxxxxxxxxx] IOError: request data read error

What am I doing wrong?

securecurve asked Aug 23 '12 11:08



2 Answers

This is not a Django issue. It is a limitation of the WSGI specification itself, inasmuch as the WSGI specification prohibits the use of chunked request content by requiring a CONTENT_LENGTH value for the request.

When using mod_wsgi, there is a switch for enabling non-standard support for chunked request content, but using it means your application isn't WSGI compliant. It also requires a custom web application or WSGI wrapper, as it still isn't going to work with Django on its own.

The option in mod_wsgi to allow chunked request content is:

WSGIChunkedRequest On

Your WSGI wrapper should call wsgi.input.read() to get the whole content, create a StringIO instance with it and use that to replace wsgi.input, and then add a new CONTENT_LENGTH value to environ with the actual length before calling the wrapped application.

Do note this is dangerous because you will not know in advance how much data is being sent, and the whole request body is read into memory.
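One way to reduce the risk noted above is to read the body in pieces and give up once it passes a size cap. The following is only a sketch under stated assumptions, not part of mod_wsgi or Django: the names read_limited, limit_chunked_body, BodyTooLarge and the 1 MiB MAX_BODY_BYTES cap are hypothetical, and it assumes that wsgi.input.read(size) returns an empty string at the end of a chunked body. It uses io.BytesIO, which is available on both Python 2.6+ and Python 3.

```python
import io

MAX_BODY_BYTES = 1024 * 1024  # hypothetical 1 MiB cap; choose a limit that suits your application
CHUNK_SIZE = 8192             # how much to request from the stream per read


class BodyTooLarge(Exception):
    """Raised when the request body exceeds the configured limit."""


def read_limited(stream, limit=MAX_BODY_BYTES, chunk_size=CHUNK_SIZE):
    """Read the stream to EOF in pieces, failing once more than `limit` bytes arrive."""
    pieces = []
    total = 0
    while True:
        piece = stream.read(chunk_size)
        if not piece:  # an empty result means end of input
            break
        total += len(piece)
        if total > limit:
            raise BodyTooLarge("request body exceeds %d bytes" % limit)
        pieces.append(piece)
    return b"".join(pieces)


def limit_chunked_body(app, limit=MAX_BODY_BYTES):
    """Wrap a WSGI app so that chunked bodies are buffered with a size cap."""
    def wrapper(environ, start_response):
        if environ.get("mod_wsgi.input_chunked") == "1":
            try:
                data = read_limited(environ["wsgi.input"], limit)
            except BodyTooLarge:
                # Reject the request instead of buffering an unbounded body.
                start_response("413 Request Entity Too Large",
                               [("Content-Type", "text/plain")])
                return [b"Request body too large\n"]
            environ["CONTENT_LENGTH"] = str(len(data))
            environ["wsgi.input"] = io.BytesIO(data)
        return app(environ, start_response)
    return wrapper
```

In a WSGI script this would be applied once at import time, for example application = limit_chunked_body(get_wsgi_application()).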

What client are you using anyway that only supports chunked request content?


UPDATE 1

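Your code is broken for numerous reasons: iterating over the stream consumes it (and yields lines, not bytes), so the later read() has nothing left to return; CONTENT_LENGTH must be a string, not an integer; and wsgi.input is never replaced with something Django can read again. You should be using something like: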

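import StringIO

from django.core.wsgi import get_wsgi_application

# Create the Django WSGI application once, at import time, not on every request.
django_application = get_wsgi_application()

def application(environ, start_response):

    # mod_wsgi sets this flag when the request body uses chunked encoding.
    if environ.get("mod_wsgi.input_chunked") == "1":
        stream = environ["wsgi.input"]
        # Read the whole body so that its real length is known.
        data = stream.read()
        # WSGI requires CONTENT_LENGTH to be a string.
        environ["CONTENT_LENGTH"] = str(len(data))
        # Give Django a fresh, readable copy of the body.
        environ["wsgi.input"] = StringIO.StringIO(data)

    return django_application(environ, start_response)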

Note that this will not help with gzip'd request content. You would need an additional check to detect when the content encoding indicates compressed data, and then do the same as above. This is because when the data is uncompressed by Apache, the content length changes and you need to recalculate it.
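A rough sketch of that additional check might look like the following. It rests on assumptions that are not confirmed here: that Apache has already decompressed the body, and that the request's Content-Encoding header is still visible to the application as HTTP_CONTENT_ENCODING (whether it is depends on the server configuration). The names needs_rebuffering and rebuffer_body are hypothetical. It uses io.BytesIO so that it runs on both Python 2.6+ and Python 3.

```python
import io


def needs_rebuffering(environ):
    """Return True when CONTENT_LENGTH cannot be trusted for this request."""
    if environ.get("mod_wsgi.input_chunked") == "1":
        return True
    # Assumption: the Content-Encoding request header is still visible here.
    encoding = environ.get("HTTP_CONTENT_ENCODING", "").strip().lower()
    return encoding in ("gzip", "x-gzip", "deflate")


def rebuffer_body(app):
    """Wrap a WSGI app so CONTENT_LENGTH is recalculated from the actual body."""
    def wrapper(environ, start_response):
        if needs_rebuffering(environ):
            # Read everything the server hands over, then measure it.
            data = environ["wsgi.input"].read()
            environ["CONTENT_LENGTH"] = str(len(data))
            environ["wsgi.input"] = io.BytesIO(data)
        return app(environ, start_response)
    return wrapper
```

Requests that are neither chunked nor marked as compressed pass through untouched, so the extra buffering only applies where the original length is unreliable.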

Graham Dumpleton answered Nov 02 '22 07:11



Now everything works smoothly. The problem was daemon mode, which doesn't work with chunked HTTP traffic (according to Graham Dumpleton, this may work in mod_wsgi 4). So, if you have this problem, switch mod_wsgi to embedded mode.

As a modification to Graham's code in the WSGI wrapper, there are two ways to read the stream held in the WSGI environ:

First one:

try:
    while True:
        data += stream.next()
except:
    print 'Done with reading the stream ...'

Second one:

try:
    data += stream.read()
except:
    print 'Done with reading the stream ...'

The first code stub was able to read the buffer in daemon mode, but it stopped partway and the program did not continue running (which confused me a bit, as I expected it to work), while the second code stub crashed with an IOError and only worked in embedded mode.
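When comparing the two modes, it may help to tell a normal end of stream apart from a read failure, since a bare except hides the difference. The helper below is a hypothetical sketch (read_body_verbose is not part of mod_wsgi): it catches only IOError and returns both the bytes read so far and the error, if any. It assumes that stream.read(size) returns an empty string at end of input.

```python
def read_body_verbose(stream, chunk_size=8192):
    """Read to EOF and return (data, error) instead of swallowing failures.

    `error` is None on a clean end of stream, or the IOError that
    interrupted reading; `data` holds whatever was read up to that point.
    """
    pieces = []
    try:
        while True:
            piece = stream.read(chunk_size)
            if not piece:  # clean end of input
                return b"".join(pieces), None
            pieces.append(piece)
    except IOError as exc:
        # Keep the partial data so the log can show how far reading got.
        return b"".join(pieces), exc
```

Logging len(data) together with the error shows how much of the request arrived before the failure, which makes the daemon mode and embedded mode behaviour easier to compare.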

One more thing to add: upgrading mod_wsgi from 3.3 to 3.4 didn't solve the problem, so you have to switch to embedded mode.

Those are my results and observations. If you have any comments, additions, or corrections, please don't hesitate to share them.

Thanks!

securecurve answered Nov 02 '22 05:11
