ModelSerializer is extremely slow in Django REST framework

I am using Django REST framework for my API, and yesterday I wanted to see how it performs with large data sets. I found this tutorial about how to profile your requests (written by Tom Christie) and discovered that for 10,000 users, my request was taking an astonishing 2 minutes 20 seconds.

Most of the time (around 65%) was spent serializing the objects, so I was wondering what I can do to speed things up.

My user profile model extends the default Django User model through a one-to-one relation, so .values() does not work for me: it is a LOT faster, but it does not return the nested model.

Any help would be greatly appreciated :)

Edit

I am already using .select_related() when retrieving my queryset, and it has improved my time, but only by a few seconds. The total number of queries is 10, so my problem is not with the database access.

I am also using .defer() to skip fields that I don't need in this request. That provided another small improvement, but not enough.

Edit #2

Models

from django.contrib.auth.models import User
from django.db.models import ForeignKey, OneToOneField
from django.utils.translation import gettext_lazy as _

from userena.models import UserenaLanguageBaseProfile
from django_extensions.db.fields import CreationDateTimeField
from django_extensions.db.fields import ModificationDateTimeField

from mycompany.models import MyCompany


class UserProfile(UserenaLanguageBaseProfile):
    user = OneToOneField(User, related_name='user_profile')
    company = ForeignKey(MyCompany)
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))

Serializers

from django.contrib.auth.models import User

from rest_framework import serializers

from accounts.models import UserProfile


class UserSerializer(serializers.ModelSerializer):
    last_login = serializers.ReadOnlyField()
    date_joined = serializers.ReadOnlyField()
    is_active = serializers.ReadOnlyField()

    class Meta:
        model = User
        fields = (
            'id',
            'last_login',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_active',
            'date_joined',
        )


class UserProfileSerializer(serializers.ModelSerializer):
    user = UserSerializer()

    class Meta:
        model = UserProfile
        fields = (
            'id',
            'user',
            'mugshot',
            'language',
        )

Views

from rest_framework import generics, mixins


class UserProfileList(generics.GenericAPIView,
                      mixins.ListModelMixin,
                      mixins.CreateModelMixin):

    serializer_class = UserProfileSerializer
    permission_classes = (UserPermissions, )

    def get_queryset(self):
        company = self.request.user.user_profile.company
        return UserProfile.objects.select_related().filter(company=company)

    @etag(etag_func=UserListKeyConstructor())
    def get(self, request, *args, **kwargs):
        return self.list(request, *args, **kwargs)

AdelaN asked Mar 12 '15 17:03


3 Answers

Performance issues almost always come from N+1 queries. They usually happen because you reference related models, and Django generates one query per relationship per object to fetch that data. You can improve this by using .select_related and .prefetch_related in your get_queryset method, as described in my other Stack Overflow answer.

The tips that Django provides on database optimization also apply to Django REST framework, so I recommend looking into those as well.

You see the performance issues during serialization because that is when Django actually makes the queries to the database.
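
To make the N+1 pattern concrete, here is a minimal sketch that uses only Python's built-in sqlite3 module instead of Django. The table names, columns, and the run and count_queries helpers are invented for illustration. The JOIN version corresponds roughly to what select_related asks the database to do for a "to one" relation.

```python
import sqlite3

# In-memory database with 100 users and one profile per user (made-up schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE profile (id INTEGER PRIMARY KEY, user_id INTEGER, language TEXT);
""")
conn.executemany("INSERT INTO auth_user VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1, 101)])
conn.executemany("INSERT INTO profile VALUES (?, ?, ?)",
                 [(i, i, "en") for i in range(1, 101)])

query_count = 0


def run(sql, params=()):
    """Execute one SQL statement and count it (hypothetical helper)."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def n_plus_one():
    """One query for the profiles, then one more query per profile."""
    rows = []
    for profile_id, user_id, language in run(
            "SELECT id, user_id, language FROM profile ORDER BY id"):
        (username,) = run(
            "SELECT username FROM auth_user WHERE id = ?", (user_id,))[0]
        rows.append({"id": profile_id, "language": language,
                     "user": {"username": username}})
    return rows


def joined():
    """A single JOIN that returns the same data in one round trip."""
    return [
        {"id": profile_id, "language": language, "user": {"username": username}}
        for profile_id, language, username in run(
            "SELECT p.id, p.language, u.username "
            "FROM profile p JOIN auth_user u ON u.id = p.user_id "
            "ORDER BY p.id")
    ]


def count_queries(fn):
    """Reset the counter, call fn, and return (result, queries issued)."""
    global query_count
    query_count = 0
    result = fn()
    return result, query_count


slow_rows, slow_count = count_queries(n_plus_one)
fast_rows, fast_count = count_queries(joined)
print(slow_count, fast_count)
```

Both functions build the same list of dictionaries. The only difference is the number of round trips to the database, which is the cost that shows up as "serialization time" when the lookups happen lazily.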

Kevin Brown-Silva answered Oct 22 '22 03:10


I know this is old and you probably solved your problem already ... but for anyone else making it to this article...

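The problem is you're doing a blind

select_related()

with no parameters. Called that way, Django only follows the non-null foreign keys it can find, and the Django docs advise against it in most cases because it tends to fetch more data than you need. It is better to name the relations you actually serialize:

select_related('user')

Without getting into the details, select_related is for "to one" relationships (ForeignKey and OneToOneField), and prefetch_related is for "to many" relationships (ManyToManyField and reverse ForeignKey). In your case, UserProfile.user is a OneToOneField, so it is a "to one" relationship and select_related is the right tool. Note that 'user_profile' is the related_name on User, so it is only a valid lookup when you query from User, not from UserProfile.

Your get_queryset() lives in the view, not the serializer. Change it to something like this, keeping the company filter:

def get_queryset(self):
    company = self.request.user.user_profile.company
    return UserProfile.objects.select_related('user').filter(company=company)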
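
For the "to many" case, the Django docs describe prefetch_related as doing a separate lookup for each relationship and doing the joining in Python. A rough sketch of that idea with the built-in sqlite3 module follows. The schema and the companies_with_profiles function are made up for illustration and are not Django's actual implementation.

```python
import sqlite3
from collections import defaultdict

# Made-up schema: one company has many profiles.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE profile (id INTEGER PRIMARY KEY, company_id INTEGER, language TEXT);
    INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO profile VALUES (1, 1, 'en'), (2, 1, 'fr'), (3, 2, 'de');
""")


def companies_with_profiles():
    # Query 1: the parent rows.
    companies = conn.execute(
        "SELECT id, name FROM company ORDER BY id").fetchall()
    ids = [company_id for company_id, _ in companies]

    # Query 2: every child row for those parents, in a single IN (...) lookup.
    placeholders = ", ".join("?" for _ in ids)
    children = conn.execute(
        f"SELECT company_id, language FROM profile "
        f"WHERE company_id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()

    # Attach children to their parents in Python instead of in SQL.
    by_company = defaultdict(list)
    for company_id, language in children:
        by_company[company_id].append(language)

    return [{"name": name, "languages": by_company[company_id]}
            for company_id, name in companies]


result = companies_with_profiles()
print(result)
```

The point of the sketch is that the number of queries stays at two no matter how many parent rows there are, instead of growing by one query per parent.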

jaredn3 answered Oct 22 '22 04:10


ModelSerializers are slow, as you said yourself. Here's some more information on why that happens and how to speed things up: https://hakibenita.com/django-rest-framework-slow

  • In performance-critical endpoints, use a "regular" serializer, or none at all.
  • Serializer fields that are not used for writing or validation should be read-only.
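
As a rough illustration of the "none at all" option, the sketch below builds the response dictionaries with a plain function. The dataclasses are stand-ins for the Django models so the example runs on its own, and serialize_profile and serialize_profiles are hypothetical names, not part of Django REST framework.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:
    """Stand-in for django.contrib.auth.models.User (illustration only)."""
    id: int
    username: str
    email: str
    date_joined: datetime


@dataclass
class UserProfile:
    """Stand-in for the UserProfile model from the question."""
    id: int
    user: User
    language: str


def serialize_profile(profile):
    """Turn one profile into a JSON-ready dict without any serializer class."""
    user = profile.user
    return {
        "id": profile.id,
        "language": profile.language,
        "user": {
            "id": user.id,
            "username": user.username,
            "email": user.email,
            "date_joined": user.date_joined.isoformat(),
        },
    }


def serialize_profiles(profiles):
    """Serialize an iterable of profiles into a list of dicts."""
    return [serialize_profile(profile) for profile in profiles]


adela = User(1, "adela", "adela@example.com", datetime(2015, 3, 12, 17, 3))
data = serialize_profiles([UserProfile(10, adela, "en")])
print(data)
```

In a real view the resulting list could be handed to a DRF Response. For the second bullet, ModelSerializer accepts a read_only_fields option in its Meta class, which is one way to mark output-only fields.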

barrtin answered Oct 22 '22 02:10