
Django request data returns str instead of list

I'm developing a REST API with Django and Django REST framework. I have an endpoint that takes a POST request with JSON like this:

{
        "pipeline": ["Bayes"],
        "material": [
            "Rakastan iloisuutta!",
            "Autojen kanssa pitää olla varovainen.",
            "Paska kesä taas. Kylmää ja sataa"
        ]
    }

It is a machine-learning analysis API, and the JSON tells it to apply the Bayes classifier to the provided strings and return the results. This works fine when I test it manually by sending POST requests. However, it breaks down when I try to write a unit test. I have the following test:

class ClassifyTextAPITests(APITestCase):
    fixtures = ['fixtures/analyzerfixtures.json'] #suboptimal fixture since requires bayes.pkl in /assets/classifiers folder

    def test_classification(self):
        """ Make sure that the API will respond correctly when required url params are supplied.
        """
        response = self.client.post(reverse('analyzer_api:classifytext'), {
            "pipeline": ["Bayes"],
            "material": [
                "Rakastan iloisuutta!",
                "Autojen kanssa pitää olla varovainen.",
                "Paska kesä taas. Kylmää ja sataa",
            ]
        })
        self.assertTrue(status.is_success(response.status_code))
        self.assertEqual(response.data[0], 1)

The test fails every time because the last assertion raises "AssertionError: 'P' != 1".

Here is my view code:

class ClassifyText(APIView):
    """
    Takes text snippet as a parameter and returns analyzed result.
    """
    authentication_classes = (authentication.TokenAuthentication,)
    permission_classes = (permissions.AllowAny,)
    parser_classes = (JSONParser,)

    def post(self, request, format=None):
        try:
            self._validate_post_data(request.data)
            print("juttu", request.data["material"])
            #create pipeline from request
            pipeline = Pipeline()
            for component_name in request.data["pipeline"]:
                pipeline.add_component(component_name)

            response = pipeline.execute_pipeline(request.data['material'])
            status_code = status.HTTP_200_OK

        except Exception as e:
            response = {"message": "Please provide a proper data.",
                        "error": str(e) }
            status_code = status.HTTP_400_BAD_REQUEST

        return Response(response, status=status_code)

    def _validate_post_data(self, data):
        if "pipeline" not in data:
            raise InvalidRequest("Pipeline field is missing. Should be array of components used in analysis. Available components at /api/classifiers")

        if len(data["pipeline"]) < 1:
            raise InvalidRequest("Pipeline array is empty.")

        if "material" not in data:
            raise InvalidRequest("Material to be analyzed is missing. Please provide an array of strings.")

        if len(data["material"]) < 1:
            raise InvalidRequest("Material to be analyzed is missing, array is empty. Please provide an array of strings.")

The really interesting part came when I fired up the debugger to check what happens here. It turns out that the line

request.data['material']

gives the last entry of the list in my request, in this case

"Paska kesä taas. Kylmää ja sataa"

However, when I inspect the contents of request.data, it shows a QueryDict with the lists pipeline and material exactly as they are in the request. Why do I get a string instead of the material list when I call request.data["material"]? Have I forgotten something, such as specifying some kind of serializer? And why does it work during normal execution but not in tests?

I'm using Django 1.8 with Python 3. Also, I'm not tying the view to any specific model.

Finally, here is what my debugger shows when I put breakpoints in the view. request.data:

QueryDict: {'material': ['Rakastan iloisuutta!', 'Autojen kanssa pitää olla varovainen.', 'Paska kesä taas. Kylmää ja sataa'], 'pipeline': ['Bayes']}

asd = request.data["material"]:

'Paska kesä taas. Kylmää ja sataa'
Tumetsu asked Jul 13 '15 20:07




3 Answers

This is because QueryDict returns the last value of a list in __getitem__:

QueryDict.__getitem__(key)

Returns the value for the given key. If the key has more than one value, __getitem__() returns the last value. Raises django.utils.datastructures.MultiValueDictKeyError if the key does not exist. (This is a subclass of Python’s standard KeyError, so you can stick to catching KeyError.)

https://docs.djangoproject.com/en/1.8/ref/request-response/#django.http.QueryDict.getitem
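
The documented behaviour can be illustrated with a small stand-in. This is only a sketch: MultiValueSketch is a hypothetical toy class written for this example, not Django's QueryDict, and it mimics just the two lookups discussed here (square-bracket access and getlist).

```python
# Hypothetical, simplified stand-in for a multi-value dictionary.
# It is NOT Django's QueryDict; it only imitates the lookups discussed above.
class MultiValueSketch:
    def __init__(self):
        self._lists = {}

    def appendlist(self, key, value):
        # Store every value submitted under the same key.
        self._lists.setdefault(key, []).append(value)

    def __getitem__(self, key):
        # Square-bracket access: the last value for the key wins.
        return self._lists[key][-1]

    def getlist(self, key):
        # Return all values for the key (empty list if the key is absent).
        return list(self._lists.get(key, []))


data = MultiValueSketch()
for text in ["first", "second", "third"]:
    data.appendlist("material", text)

last_value = data["material"]          # only the last entry
all_values = data.getlist("material")  # the full list
```

With the question's data, the square-bracket lookup corresponds to getting only 'Paska kesä taas. Kylmää ja sataa' back.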

If you post a form, in which a key maps to a list:

d = {"a": 123, "b": [1,2,3]}
requests.post("http://127.0.0.1:6666", data=d)

this is what you get in the request body:

a=123&b=1&b=2&b=3
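
The same encoding can be reproduced with the standard library alone, as a rough sketch of what an HTTP client does with form data. The doseq=True argument is what expands a list into repeated keys; parse_qs then shows that the receiving side sees several values under one key.

```python
from urllib.parse import urlencode, parse_qs

d = {"a": 123, "b": [1, 2, 3]}

# doseq=True emits one key=value pair per list element.
body = urlencode(d, doseq=True)

# Parsing the body back yields a list of string values for every key.
parsed = parse_qs(body)
```

Note that the values come back as strings and that even the single-valued key a is wrapped in a list, which is why a multi-value container needs a rule for what plain indexing returns.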

Since the test method posts the data as a form rather than as JSON, what you get from request.data is a QueryDict (the same as request.POST), so indexing request.data with a key returns only the last value in the list.

To get the expected behavior, post the data as JSON in the request body (as in @Vladir Parrado Cruz's answer).

NeoWang answered Nov 10 '22 08:11


By default, a QueryDict returns a single item from the list when you make a __getitem__ call (that is, access by square brackets, as you do in request.data['material']).

You can instead use the getlist method to return all values for the key:
https://docs.djangoproject.com/en/1.8/ref/request-response/#django.http.QueryDict.getlist

class ClassifyText(APIView):
    """
    Takes text snippet as a parameter and returns analyzed result.
    """
    authentication_classes = (authentication.TokenAuthentication,)
    permission_classes = (permissions.AllowAny,)
    parser_classes = (JSONParser,)

    def post(self, request, format=None):
        try:
            self._validate_post_data(request.data)
            print("juttu", request.data["material"])
            print("juttu", request.data.getlist("material"))
            #create pipeline from request
            pipeline = Pipeline()
            # getlist() is needed here too; indexing a QueryDict would yield the string "Bayes"
            for component_name in request.data.getlist("pipeline"):
                pipeline.add_component(component_name)

            response = pipeline.execute_pipeline(request.data.getlist('material'))
            status_code = status.HTTP_200_OK

        except Exception as e:
            response = {"message": "Please provide a proper data.",
                        "error": str(e) }
            status_code = status.HTTP_400_BAD_REQUEST

        return Response(response, status=status_code)

    def _validate_post_data(self, data):
        if "pipeline" not in data:
            raise InvalidRequest("Pipeline field is missing. Should be array of components used in analysis. Available components at /api/classifiers")

        if len(data["pipeline"]) < 1:
            raise InvalidRequest("Pipeline array is empty.")

        if "material" not in data:
            raise InvalidRequest("Material to be analyzed is missing. Please provide an array of strings.")

        if len(data["material"]) < 1:
            raise InvalidRequest("Material to be analyzed is missing, array is empty. Please provide an array of strings.")

Anentropic answered Nov 10 '22 10:11


Try something like this in the test:

import json

def test_classification(self):
    """ Make sure that the API will respond correctly when required url params are supplied.
    """
    response = self.client.post(
        reverse('analyzer_api:classifytext'),
        json.dumps({
            "pipeline": ["Bayes"],
            "material": [
                "Rakastan iloisuutta!",
                "Autojen kanssa pitää olla varovainen.",
                "Paska kesä taas. Kylmää ja sataa",
            ]
        }),
        content_type='application/json'
    )
    self.assertTrue(status.is_success(response.status_code))
    self.assertEqual(response.data[0], 1)

If you send the data as JSON, it should work, because the JSON parser keeps the arrays as lists.
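
As a quick sanity check of why a JSON body behaves differently from a form body, here is a minimal standard-library sketch. The payload texts are placeholders chosen for this example, and the round trip only imitates what a JSON parser does on the server side; it does not involve Django at all.

```python
import json

# Placeholder payload with the same shape as the request in the question.
payload = {
    "pipeline": ["Bayes"],
    "material": ["first text", "second text", "third text"],
}

# What the client sends as the request body.
body = json.dumps(payload)

# What a JSON parser hands back: arrays are still lists.
decoded = json.loads(body)
material = decoded["material"]
```

Django REST framework's test client also accepts format='json' in self.client.post(...), which serializes the dictionary for you instead of calling json.dumps by hand.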

Vladir Parrado Cruz answered Nov 10 '22 09:11