I'm developing a REST API with Django and REST framework. I have an endpoint which takes a POST request with this kind of JSON:
{
    "pipeline": ["Bayes"],
    "material": [
        "Rakastan iloisuutta!",
        "Autojen kanssa pitää olla varovainen.",
        "Paska kesä taas. Kylmää ja sataa"
    ]
}
It is a machine-learning analysis API, and the JSON tells it to apply the Bayes classifier to the provided strings and return the results. This works fine when I test it manually by sending POST requests. However, it breaks down when I try to write a unit test. I have the following test:
class ClassifyTextAPITests(APITestCase):
    # Suboptimal fixture since it requires bayes.pkl in the /assets/classifiers folder.
    fixtures = ['fixtures/analyzerfixtures.json']

    def test_classification(self):
        """ Make sure that the API will respond correctly when required url params are supplied.
        """
        response = self.client.post(reverse('analyzer_api:classifytext'), {
            "pipeline": ["Bayes"],
            "material": [
                "Rakastan iloisuutta!",
                "Autojen kanssa pitää olla varovainen.",
                "Paska kesä taas. Kylmää ja sataa",
            ]
        })
        self.assertTrue(status.is_success(response.status_code))
        self.assertEqual(response.data[0], 1)
The test fails every time because the latter assert raises "AssertionError: 'P' != 1".
Here is my view code:
class ClassifyText(APIView):
    """
    Takes text snippet as a parameter and returns analyzed result.
    """
    authentication_classes = (authentication.TokenAuthentication,)
    permission_classes = (permissions.AllowAny,)
    parser_classes = (JSONParser,)

    def post(self, request, format=None):
        try:
            self._validate_post_data(request.data)
            print("juttu", request.data["material"])
            # create pipeline from request
            pipeline = Pipeline()
            for component_name in request.data["pipeline"]:
                pipeline.add_component(component_name)
            response = pipeline.execute_pipeline(request.data['material'])
            status_code = status.HTTP_200_OK
        except Exception as e:
            response = {"message": "Please provide a proper data.",
                        "error": str(e)}
            status_code = status.HTTP_400_BAD_REQUEST
        return Response(response, status=status_code)

    def _validate_post_data(self, data):
        if "pipeline" not in data:
            raise InvalidRequest("Pipeline field is missing. Should be array of components used in analysis. Available components at /api/classifiers")
        if len(data["pipeline"]) < 1:
            raise InvalidRequest("Pipeline array is empty.")
        if "material" not in data:
            raise InvalidRequest("Material to be analyzed is missing. Please provide an array of strings.")
        if len(data["material"]) < 1:
            raise InvalidRequest("Material to be analyzed is missing, array is empty. Please provide an array of strings.")
The really interesting part came when I fired up the debugger to check what happens here. It turns out that the expression
request.data['material']
gives the last entry of the list in my request, in this case
"Paska kesä taas. Kylmää ja sataa"
However, when I inspect the contents of request.data, it shows a QueryDict with the pipeline and material lists exactly as they are in the request. Why do I get a string instead of the material list when I call request.data["material"]? Have I forgotten something, such as specifying some kind of serializer? And why does it work during normal execution but not in tests?
I'm using Django 1.8 with Python 3. Also, I'm not tying the view to any specific model.
Finally, here is what my debugger shows when I put breakpoints in the view:
request.data:
QueryDict: {'material': ['Rakastan iloisuutta!', 'Autojen kanssa pitää olla varovainen.', 'Paska kesä taas. Kylmää ja sataa'], 'pipeline': ['Bayes']}
asd = request.data["material"]:
'Paska kesä taas. Kylmää ja sataa'
This is because QueryDict returns the last value of a list in __getitem__:
QueryDict.__getitem__(key)
Returns the value for the given key. If the key has more than one value, __getitem__() returns the last value. Raises django.utils.datastructures.MultiValueDictKeyError if the key does not exist. (This is a subclass of Python’s standard KeyError, so you can stick to catching KeyError.)
https://docs.djangoproject.com/en/1.8/ref/request-response/#django.http.QueryDict.getitem
If you post a form, in which a key maps to a list:
d = {"a": 123, "b": [1,2,3]}
requests.post("http://127.0.0.1:6666", data=d)
this is what you get in the request body:
a=123&b=1&b=2&b=3
Since the test method posts the data as a form, what you get from request.data is a QueryDict (the same as request.POST), so request.data['material'] returns only the last value in the list.
To get the expected behavior, post the data as JSON in the request body (as in @Vladir Parrado Cruz's answer).
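The form encoding described above can be reproduced with the standard library alone. This is a minimal sketch rather than Django's own code: it assumes that urlencode with doseq=True mirrors how the list is expanded into repeated keys, and it imitates QueryDict's square-bracket access by taking the last parsed value.

```python
from urllib.parse import parse_qs, urlencode

payload = {"a": 123, "b": [1, 2, 3]}

# doseq=True expands a list into one key=value pair per element.
body = urlencode(payload, doseq=True)

# parse_qs collects repeated keys back into lists of strings.
parsed = parse_qs(body)

# Imitate QueryDict.__getitem__: only the last value for the key survives.
last_b = parsed["b"][-1]

print(body)
print(parsed)
print(last_b)
```

Note that every value comes back as a string, and that the list structure survives only if you ask for all values of the key rather than a single one.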
By default, a QueryDict returns a single item from the list on a __getitem__ call (or on access by square brackets, as you do in request.data['material']).
You can instead use the getlist method to return all values for the key:
https://docs.djangoproject.com/en/1.8/ref/request-response/#django.http.QueryDict.getlist
class ClassifyText(APIView):
    """
    Takes text snippet as a parameter and returns analyzed result.
    """
    authentication_classes = (authentication.TokenAuthentication,)
    permission_classes = (permissions.AllowAny,)
    parser_classes = (JSONParser,)

    def post(self, request, format=None):
        try:
            self._validate_post_data(request.data)
            # Square brackets return only the last value for the key...
            print("juttu", request.data["material"])
            # ...while getlist returns all of them.
            print("juttu", request.data.getlist("material"))
            # create pipeline from request
            pipeline = Pipeline()
            # getlist here too; otherwise the loop would iterate over the characters of "Bayes"
            for component_name in request.data.getlist("pipeline"):
                pipeline.add_component(component_name)
            response = pipeline.execute_pipeline(request.data.getlist('material'))
            status_code = status.HTTP_200_OK
        except Exception as e:
            response = {"message": "Please provide a proper data.",
                        "error": str(e)}
            status_code = status.HTTP_400_BAD_REQUEST
        return Response(response, status=status_code)

    def _validate_post_data(self, data):
        if "pipeline" not in data:
            raise InvalidRequest("Pipeline field is missing. Should be array of components used in analysis. Available components at /api/classifiers")
        if len(data["pipeline"]) < 1:
            raise InvalidRequest("Pipeline array is empty.")
        if "material" not in data:
            raise InvalidRequest("Material to be analyzed is missing. Please provide an array of strings.")
        if len(data["material"]) < 1:
            raise InvalidRequest("Material to be analyzed is missing, array is empty. Please provide an array of strings.")
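One thing to keep in mind with this approach: getlist exists on QueryDict but not on the plain dict that a JSON body is parsed into, so a view that calls it directly may fail once the client really sends JSON. A small helper can accept both shapes. The sketch below is hypothetical: get_values and FakeQueryDict are names invented for illustration, and FakeQueryDict only stands in for Django's QueryDict so that the snippet runs without Django.

```python
def get_values(data, key):
    """Return every value stored under key as a list."""
    if hasattr(data, "getlist"):
        # Multi-value mapping such as QueryDict (form-encoded request).
        return data.getlist(key)
    value = data[key]
    # JSON body: the value is already a list, or a single scalar.
    return value if isinstance(value, list) else [value]


class FakeQueryDict(dict):
    """Stand-in for QueryDict: [] gives the last value, getlist() gives all."""

    def __getitem__(self, key):
        return super().__getitem__(key)[-1]

    def getlist(self, key):
        return list(super().__getitem__(key))


form_data = FakeQueryDict(material=["first", "second"])
json_data = {"material": ["first", "second"]}

print(form_data["material"])                # last value only
print(get_values(form_data, "material"))    # all values
print(get_values(json_data, "material"))    # all values
```

In the view, the call would then read pipeline.execute_pipeline(get_values(request.data, 'material')), which behaves the same for form-encoded and JSON requests.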
Try something like this in the test:
import json

def test_classification(self):
    """ Make sure that the API will respond correctly when required url params are supplied.
    """
    response = self.client.post(
        reverse('analyzer_api:classifytext'),
        json.dumps({
            "pipeline": ["Bayes"],
            "material": [
                "Rakastan iloisuutta!",
                "Autojen kanssa pitää olla varovainen.",
                "Paska kesä taas. Kylmää ja sataa",
            ]
        }),
        content_type='application/json'
    )
    self.assertTrue(status.is_success(response.status_code))
    self.assertEqual(response.data[0], 1)
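Sending the data as JSON should make it work, since the JSON parser then hands the view a plain dict whose "material" key holds the full list. REST framework's test client also accepts format='json' as an alternative to calling json.dumps and setting content_type yourself.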