Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django Rest Framework / Django performance issue

I am currently deploying Django w/ Django-Rest-Framework on my EC2 small instance server to provide a set of APIs for a couple of Android apps.

The problem is that I am facing a serious performance issue that I had to profile. I found out that most of the time for a single request is spent inside DRF's core.

Sorry for making this a long post but I think i have to show everything so that I can get a proper answer for whats going on. Let me go ahead:

My setup is nginx / uwsgi. Here's how I am running uwsgi using upstart:

description "pycms"
start on [2345]
stop on [06]

respawn

# start from virtualenv path
chdir /www/python/apps/pycms/

exec uwsgi -b 25000 --chdir=/www/python/apps/pycms --module=wsgi:application --env DJANGO_SETTINGS_MODULE=settings --socket=127.0.0.1:8081 --processes=5  --harakiri=20  --max-requests=5000  --vacuum --master --pidfile=/tmp/pycms-master.pid 

Assuming I request the following API:

http://IP_ADDRESS/api/nodes/mostviewed/9/

Which matches the following rule:

url(r'^nodes/mostviewed/(?P<category>\d+)/$', MostViewedNodesList.as_view(), name='mostviewed-nodes-list'),

Here's the class-based view:

class MostViewedNodesList(generics.ListAPIView):
    """
    API endpoint that lists featured nodes
    """
    model = ObjectStats
    serializer_class = NodeSerializer
    permission_classes = (permissions.AllowAny,)

    def get_queryset(self):
        if(self.kwargs.has_key('category')):
            category_id = self.kwargs.get('category')
            return ObjectStats.get_most_viewed(category_id)
        else:
            return []

The serializer class:

class NodeSerializer(serializers.ModelSerializer):
    images = ImageSerializer()
    favorite = ObjectField(source='is_favorite')
    rating = ObjectField(source='get_rating')
    meta = ObjectField(source='get_meta')
    url = ObjectField(source='get_absolute_url')
    channel_title = ObjectField(source='channel_title')

    class Meta:
        model = Node
        fields = ('id', 'title', 'body', 'images', 'parent', 'type', 'rating', 'meta', 'favorite', 'url', 'channel_title')

And finally the classmethod 'get_most_viewed' (I know it's wrong to use classmethods rather than manager method)

@classmethod
    def get_most_viewed(cls, category_id):
        return list(Node.objects.extra(
            where=['objects.id=content_api_objectviewsstats.node_id', 'content_api_objectviewsstats.term_id=%s'],
            params=[category_id],
            tables=['content_api_objectviewsstats'],
            order_by=['-content_api_objectviewsstats.num_views']
        ).prefetch_related('images', 'objectmeta_set').select_related('parent__parent'))

As you can see from all this, it's a normal request the is redirected to the designated view, fetching data from MySQL and then returning the serialized result. Nothing out of the ordinary or any complex processing.

Executing:

ab -c 500 -n 5000 http://IP_HERE/api/nodes/mostviewed/9/

Notice that this is without caching. The following is frequently logged:

HARAKIRI: --- uWSGI worker 5 (pid: 31015) WAS managing request /api/nodes/mostviewed/9/ since Sat Feb 16 13:07:21 2013 ---
DAMN ! worker 2 (pid: 31006) died, killed by signal 9 :( trying respawn ...
Respawned uWSGI worker 2 (new pid: 31040)

Load average during testing goes up to ~ 5. And here's the ab result:

Concurrency Level:      500
Time taken for tests:   251.810 seconds
Complete requests:      1969
Failed requests:        1771
   (Connect: 0, Receive: 0, Length: 1771, Exceptions: 0)
Write errors:           0
Non-2xx responses:      1967
Total transferred:      702612 bytes
HTML transferred:       396412 bytes
Requests per second:    7.82 [#/sec] (mean)
Time per request:       63943.511 [ms] (mean)
Time per request:       127.887 [ms] (mean, across all concurrent requests)
Transfer rate:          2.72 [Kbytes/sec] received

First of all, 7 requests per second is VERY disappointing. ~ 1700 failed requests due to timeout errors is also because of the performance lag I am facing here.

To be completely honest. I am expecting ~ 60 - 70 requests per second without caching. I know that caching speeds up the process but it also hides the performance issues I have which is why I am pursuing to solve this before I cache stuff.

I decided then to profile this on a vmware CentOS machine using django-profiling which by adding ?prof to the request shows call stack:

Instance wide RAM usage

Partition of a set of 373497 objects. Total size = 65340232 bytes.
 Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
     0   2270   1  7609040  12   7609040  12 dict of django.db.models.sql.query.Query
     1  19503   5  6263400  10  13872440  21 dict (no owner)
     2  63952  17  5739672   9  19612112  30 str
     3  51413  14  5090344   8  24702456  38 list
     4  58435  16  4991160   8  29693616  45 tuple
     5  24518   7  4434112   7  34127728  52 unicode
     6   8517   2  2384760   4  36512488  56 dict of django.db.models.base.ModelState
     7   2270   1  2378960   4  38891448  60 dict of django.db.models.query.QuerySet
     8   2268   1  2376864   4  41268312  63 dict of 0x14d6920
     9   6998   2  2088304   3  43356616  66 django.utils.datastructures.SortedDict
<619 more rows. Type e.g. '_.more' to view.>



CPU Time for this request

         663425 function calls (618735 primitive calls) in 2.037 CPU seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    2.037    2.037 /usr/lib/python2.6/site-packages/django/views/generic/base.py:44(view)
        1    0.000    0.000    2.037    2.037 /usr/lib/python2.6/site-packages/django/views/decorators/csrf.py:76(wrapped_view)
        1    0.000    0.000    2.037    2.037 /usr/lib/python2.6/site-packages/rest_framework/views.py:359(dispatch)
        1    0.000    0.000    2.036    2.036 /usr/lib/python2.6/site-packages/rest_framework/generics.py:144(get)
        1    0.000    0.000    2.036    2.036 /usr/lib/python2.6/site-packages/rest_framework/mixins.py:46(list)
        1    0.000    0.000    2.029    2.029 apps/content_api/views.py:504(get_queryset)
        1    0.000    0.000    2.029    2.029 apps/objects_stats/models.py:11(get_most_viewed)
    23/21    0.000    0.000    2.028    0.097 /usr/lib/python2.6/site-packages/django/db/models/query.py:92(__iter__)
      4/2    0.003    0.001    2.028    1.014 /usr/lib/python2.6/site-packages/django/db/models/query.py:77(__len__)
        1    0.000    0.000    1.645    1.645 /usr/lib/python2.6/site-packages/django/db/models/query.py:568(_prefetch_related_objects)
        1    0.002    0.002    1.645    1.645 /usr/lib/python2.6/site-packages/django/db/models/query.py:1596(prefetch_related_objects)
        2    0.024    0.012    1.643    0.822 /usr/lib/python2.6/site-packages/django/db/models/query.py:1748(prefetch_one_level)
     2288    0.007    0.000    1.156    0.001 /usr/lib/python2.6/site-packages/django/db/models/manager.py:115(all)
     6252    0.019    0.000    0.762    0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:231(iterator)
     4544    0.025    0.000    0.727    0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:870(_clone)
     4544    0.109    0.000    0.694    0.000 /usr/lib/python2.6/site-packages/django/db/models/sql/query.py:235(clone)
     2270    0.004    0.000    0.619    0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:619(filter)
     2270    0.013    0.000    0.615    0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:633(_filter_or_exclude)
     1144    0.019    0.000    0.581    0.001 /usr/lib/python2.6/site-packages/django/db/models/fields/related.py:456(get_query_set)
     1144    0.019    0.000    0.568    0.000 /usr/lib/python2.6/site-packages/django/db/models/fields/related.py:560(get_query_set)
55917/18180    0.192    0.000    0.500    0.000 /usr/lib64/python2.6/copy.py:144(deepcopy)
     2270    0.003    0.000    0.401    0.000 /usr/lib/python2.6/site-packages/django/db/models/query.py:820(using)

And as you can see, most of the time is spent on django views & DRF views.

Can someone point out if I am doing anything wrong here? Why are requests so slow? Does python / Django scale? I read that it does but how many requests / second should I expect on a simple DB read & rendering operation such as the one I am doing?

like image 469
Maverick Avatar asked Feb 16 '13 11:02

Maverick


1 Answers

does python / django scale

Django powers some fairly massive services, such as Disqus and Instagram, so yeah, it scales just fine.

I decided then to profile this

As you can see almost the entire time is spent inside the .get_most_viewed() method, so that looks like the issue you've got. (could be wrong there, but your profile suggests that 2.028 out of 2.037 of the time is spent there, so about 99.6% of the time)

My ORM skills aren't really up to figuring out exactly how you ought to be dealing with that fairly complex query, and would in any case need to see the model definitions, but you need to look into debugging and simplifying that query, rather than looking at other parts of the view.

You might want to drop into the Django shell using manage.py shell, so that you can profile that particular query in exclusion to the rest of view.

Also maybe try getting some help on how you can simplify the query here, or on the Django IRC channel or django mailing list (You'll likely have more luck if you ask specifically about that query, rather than the more general Django/REST framework phrased question here, most of which doesn't actually appear to be relevant to the issue your seeing.

Hope that helps point you in the right direction for getting this resolved.

like image 135
Tom Christie Avatar answered Oct 12 '22 13:10

Tom Christie