
How to access request body when using Django Rest Framework and avoid getting RawPostDataException

I need to get the raw content of the POST request body (as a string), yet when I try to access request.body I get an exception:

django.http.request.RawPostDataException:
You cannot access body after reading from request's data stream

I am aware that it is advised to use request.data instead of request.body when using Django Rest Framework, yet to validate a digital signature I need the request body in its raw, "untouched" form, since that is what the third party signed and what I need to validate.

Pseudocode:

3rd_party_sign(json_data + secret_key) != validate_sign(json.dumps(request.data) + secret_key)

3rd_party_sign(json_data + secret_key) == validate_sign(request.body + secret_key)
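
To make the mismatch concrete, here is a minimal sketch. It assumes the third party signs with HMAC-SHA256, which may not match the real scheme; SECRET_KEY, sign and the sample payload are made-up names and values used only for illustration.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"shared-secret"  # hypothetical shared secret


def sign(payload: bytes, key: bytes = SECRET_KEY) -> str:
    """Hex HMAC-SHA256 of the payload (assumed scheme, for illustration only)."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


# Bytes exactly as the third party sent (and signed) them.
raw_body = b'{"amount":10.0,  "currency":"PLN"}'
third_party_signature = sign(raw_body)

# What is left after parsing and re-serializing: same data, different bytes
# (json.dumps normalizes the whitespace around separators).
reserialized = json.dumps(json.loads(raw_body)).encode("utf-8")

raw_matches = hmac.compare_digest(sign(raw_body), third_party_signature)
reserialized_matches = hmac.compare_digest(sign(reserialized), third_party_signature)
```

Only the signature computed over the untouched bytes matches, even though both byte strings decode to the same JSON object.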
Krzysiek asked Dec 05 '17 20:12


2 Answers

I found an interesting topic on DRF's GitHub, but it does not fully cover the problem. I investigated the case and came up with a neat solution. Surprisingly, there was no such question on SO, so I decided to add it for the public, following the SO self-answer guidelines.

The key to understanding the problem and its solution is how HttpRequest.body (source) works:

@property
def body(self):
    if not hasattr(self, '_body'):
        if self._read_started:
            raise RawPostDataException("You cannot access body after reading from request's data stream")
        # (...)
        try:
            self._body = self.read()
        except IOError as e:
            raise UnreadablePostError(*e.args) from e
        self._stream = BytesIO(self._body)
    return self._body

When body is accessed and self._body is already set, it is simply returned. Otherwise the internal request stream is read and the result is assigned to _body (self._body = self.read()); from then on, every access to body just returns self._body. In addition, before the stream is read there is an if self._read_started check, which raises an exception if reading has already started.

The self._read_started flag is set by the read() method (source):

def read(self, *args, **kwargs):
    self._read_started = True
    try:
        return self._stream.read(*args, **kwargs)
    except IOError as e:
        six.reraise(UnreadablePostError, ...)

It should now be clear that accessing request.body raises RawPostDataException whenever the read() method has been called without its result being assigned to the request's self._body.
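
The interplay between body and read() can be reproduced outside Django with a small stand-in class. This is only a simplified sketch of the two snippets quoted above, not Django's actual HttpRequest; FakeRequest and the local RawPostDataException are hypothetical stand-ins.

```python
from io import BytesIO


class RawPostDataException(Exception):
    """Stand-in for django.http.request.RawPostDataException."""


class FakeRequest:
    """Simplified model of the body/read() logic; not Django's real class."""

    def __init__(self, payload: bytes):
        self._stream = BytesIO(payload)
        self._read_started = False

    def read(self, *args, **kwargs):
        self._read_started = True
        return self._stream.read(*args, **kwargs)

    @property
    def body(self):
        if not hasattr(self, "_body"):
            if self._read_started:
                raise RawPostDataException(
                    "You cannot access body after reading from request's data stream"
                )
            self._body = self.read()
            self._stream = BytesIO(self._body)  # rewind for later readers
        return self._body


# Case 1: the stream is consumed directly (as a parser would do), then body is accessed.
consumed = FakeRequest(b'{"a": 1}')
consumed.read()
try:
    consumed.body
    body_failed = False
except RawPostDataException:
    body_failed = True

# Case 2: body is accessed first, so the content is cached and the stream is rewound.
cached = FakeRequest(b'{"a": 1}')
first = cached.body
second = cached.body
reread = cached.read()
```

In case 1 the body access fails because read() ran without populating _body; in case 2 every later access, including a fresh read(), still sees the full payload.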

Now let's have a look at the DRF JSONParser class (source):

class JSONParser(BaseParser):
    media_type = 'application/json'
    renderer_class = renderers.JSONRenderer

    def parse(self, stream, media_type=None, parser_context=None):
        parser_context = parser_context or {}
        encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
        try:
            data = stream.read().decode(encoding)
            return json.loads(data)
        except ValueError as exc:
            raise ParseError('JSON parse error - %s' % six.text_type(exc))

(I have chosen a slightly older version of the DRF source, because after May 2017 there were some performance improvements that obscure the key line for understanding our problem.)

Here the stream.read() call sets the _read_started flag, so the body property cannot access the stream again once the parser has run.

The solution

The "no request.body" approach is (I guess) intentional on DRF's part, so although it is technically possible to enable access to request.body globally (via custom middleware), it should NOT be done without a deep understanding of all the consequences.

Access to the raw request content may instead be granted explicitly and locally in the following manner.

First, define a custom parser:

import json

from django.conf import settings
from rest_framework import renderers
from rest_framework.exceptions import ParseError
from rest_framework.parsers import BaseParser


class MyJSONParser(BaseParser):
    media_type = 'application/json'
    renderer_class = renderers.JSONRenderer

    def parse(self, stream, media_type=None, parser_context=None):
        parser_context = parser_context or {}
        encoding = parser_context.get('encoding', settings.DEFAULT_CHARSET)
        request = parser_context.get('request')
        try:
            data = stream.read().decode(encoding)
            # Keep the raw POST content on a custom, body-like attribute
            request.raw_body = data
            return json.loads(data)
        except ValueError as exc:
            raise ParseError('JSON parse error - %s' % str(exc))

Then it can be used when it is necessary to access raw request content:

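from rest_framework.decorators import api_view, parser_classes
from rest_framework.response import Response

@api_view(['POST'])
@parser_classes((MyJSONParser,))
def example_view(request, format=None):
    request.data  # parsing is lazy: this runs MyJSONParser, which sets raw_body
    return Response({'received data': request.raw_body})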

Meanwhile, request.body remains globally inaccessible (as the DRF authors intended).
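
With the raw content available as request.raw_body, the signature check itself could look roughly like the sketch below. It assumes the signature is a hex HMAC-SHA256 digest of the raw body, which may not match what your third party does; is_signature_valid and its parameters are hypothetical names.

```python
import hashlib
import hmac


def is_signature_valid(raw_body: str, received_signature: str, secret_key: str) -> bool:
    """Assumed scheme: hex HMAC-SHA256 over the raw body, keyed with secret_key."""
    expected = hmac.new(
        secret_key.encode("utf-8"),
        raw_body.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # compare_digest keeps the comparison time independent of where the values differ
    return hmac.compare_digest(
        expected.encode("utf-8"), received_signature.encode("utf-8")
    )


# Example values, made up for illustration
body = '{"amount":10.0,  "currency":"PLN"}'
good_signature = hmac.new(b"secret", body.encode("utf-8"), hashlib.sha256).hexdigest()

accepted = is_signature_valid(body, good_signature, "secret")
rejected = is_signature_valid(body + " ", good_signature, "secret")
```

Because raw_body holds the decoded string, the sketch re-encodes it as UTF-8 before hashing. If the third party signs bytes in another encoding, storing the undecoded bytes in the parser would be the safer choice.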

Krzysiek answered Sep 17 '22 12:09


It's been a while since this question was asked, so I'm not sure whether the framework behaved differently at the time, but for anyone looking to access the raw request body with recent versions, here is what the DRF docs say about parsers:

The set of valid parsers for a view is always defined as a list of classes. When request.data is accessed, REST framework will examine the Content-Type header on the incoming request, and determine which parser to use to parse the request content.

This means the parser is executed lazily, when request.data is first accessed. So the solution can be as simple as reading request.body and caching it somewhere before accessing request.data. No need to write a custom parser.

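def some_action(self, request):
    raw_body = request.body  # cache the raw bytes before anything touches request.data
    parsed_body = request.data['something']  # first access to request.data triggers parsing
    verify_signature(raw_body, request.data['key_or_something'])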
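
Why the order matters can be seen in a toy model of lazy parsing. This is a rough sketch only: ToyRequest is a hypothetical class that imitates the behaviour described above and is not DRF's real Request.

```python
import json
from io import BytesIO


class ToyRequest:
    """Toy model of lazy parsing; NOT DRF's real Request class."""

    def __init__(self, payload: bytes):
        self._stream = BytesIO(payload)
        self._read_started = False

    def read(self) -> bytes:
        self._read_started = True
        return self._stream.read()

    @property
    def body(self) -> bytes:
        if not hasattr(self, "_body"):
            if self._read_started:
                raise RuntimeError("stream already consumed")
            self._body = self.read()
            self._stream = BytesIO(self._body)  # rewind for later readers
        return self._body

    @property
    def data(self):
        # Parsed on first access only, imitating the lazy behaviour
        if not hasattr(self, "_data"):
            self._data = json.loads(self.read())
        return self._data


payload = b'{"key_or_something": "abc",  "something": 1}'

# Order used in the snippet above: raw body first, parsed data second.
good = ToyRequest(payload)
raw_body = good.body
parsed = good.data

# Reverse order: parsing consumes the stream, so body is no longer available.
bad = ToyRequest(payload)
bad.data
try:
    bad.body
    reverse_order_works = True
except RuntimeError:
    reverse_order_works = False
```

Reading body first caches the bytes and rewinds the stream, so the later parse still sees the whole payload; parsing first leaves nothing for body to read.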
hndr answered Sep 20 '22 12:09