I am using Django REST framework for my API, and yesterday I wanted to see how it handles large data sets. I found this tutorial on profiling requests (written by Tom Christie) and discovered that for 10,000 users my request was taking an astonishing 2 minutes 20 seconds.
Most of the time (around 65%) was spent serializing the objects, so I was wondering what I can do to speed things up.
My user model extends the default Django user model, so using .values() does not work: although it is a LOT faster, it does not return the nested model.
Any help would be greatly appreciated :)
Edit
I am already using .select_related() when retrieving my queryset, and it has improved my time, but only by a few seconds. The total number of queries is 10, so my problem is not database access.
I am also using .defer() to skip fields that I don't need in this request. That provided a small improvement too, but not enough.
Edit #2
Models
from django.contrib.auth.models import User
from django.db.models import OneToOneField
from django.db.models import ForeignKey
from django.utils.translation import gettext_lazy as _
from userena.models import UserenaLanguageBaseProfile
from django_extensions.db.fields import CreationDateTimeField
from django_extensions.db.fields import ModificationDateTimeField
from mycompany.models import MyCompany


class UserProfile(UserenaLanguageBaseProfile):
    # Forward "to one" link; User gets the reverse accessor `user_profile`.
    user = OneToOneField(User, related_name='user_profile')
    company = ForeignKey(MyCompany)
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))
Serializers
from django.contrib.auth.models import User
from rest_framework import serializers
from accounts.models import UserProfile


class UserSerializer(serializers.ModelSerializer):
    last_login = serializers.ReadOnlyField()
    date_joined = serializers.ReadOnlyField()
    is_active = serializers.ReadOnlyField()

    class Meta:
        model = User
        fields = (
            'id',
            'last_login',
            'username',
            'first_name',
            'last_name',
            'email',
            'is_active',
            'date_joined',
        )


class UserProfileSerializer(serializers.ModelSerializer):
    user = UserSerializer()

    class Meta:
        model = UserProfile
        fields = (
            'id',
            'user',
            'mugshot',
            'language',
        )
Views
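from rest_framework import generics, mixins

# Other imports (etag, UserPermissions, UserListKeyConstructor,
# UserProfile, UserProfileSerializer) are not shown here.


class UserProfileList(generics.GenericAPIView,
                      mixins.ListModelMixin,
                      mixins.CreateModelMixin):
    serializer_class = UserProfileSerializer
    permission_classes = (UserPermissions, )

    def get_queryset(self):
        # Only list profiles that belong to the requesting user's company.
        company = self.request.user.user_profile.company
        return UserProfile.objects.select_related().filter(company=company)

    @etag(etag_func=UserListKeyConstructor())
    def get(self, request, *args, **kwargs):
        return self.list(request, *args, **kwargs)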
Django REST Framework (DRF) is a framework for quickly building robust REST APIs. However, when fetching models with nested relationships, it can become slow. This usually isn't due to DRF itself, but to the N+1 query problem.
Almost always the performance issues come from N+1 queries. This is usually because you are referencing related models, and a single query per relationship per object is generated to get the information. You can improve this by using .select_related and .prefetch_related in your get_queryset method, as described in my other Stack Overflow answer.
The same tips that Django provides on database optimization also apply to Django REST framework, so I would recommend looking into those as well.
The reason you are seeing the performance issues during serialization is that this is when Django actually makes the queries to the database: related objects are loaded lazily, the first time the serializer touches them.
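To illustrate the N+1 pattern without needing a Django project, here is a minimal sketch using only the standard library's sqlite3 module. The table names, the `run` helper and both `serialize_*` functions are made up for this example; they only mimic what happens when a nested serializer touches `profile.user` lazily, compared with the single JOIN that `select_related('user')` roughly produces.

```python
import sqlite3

# Hypothetical two-table schema standing in for User and UserProfile.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE auth_user (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE user_profile (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES auth_user (id),
        language TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO auth_user (id, username) VALUES (?, ?)",
    [(i, f"user{i}") for i in range(1, 101)],
)
conn.executemany(
    "INSERT INTO user_profile (id, user_id, language) VALUES (?, ?, ?)",
    [(i, i, "en") for i in range(1, 101)],
)

queries = []  # every SQL string sent through run() is recorded here


def run(sql, params=()):
    """Execute one query, record it, and return all rows."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def serialize_lazy():
    """One query for the profiles, then one more per profile for its user."""
    result = []
    profiles = run("SELECT id, user_id, language FROM user_profile ORDER BY id")
    for pid, user_id, language in profiles:
        username = run(
            "SELECT username FROM auth_user WHERE id = ?", (user_id,)
        )[0][0]
        result.append(
            {"id": pid, "language": language, "user": {"username": username}}
        )
    return result


def serialize_joined():
    """A single JOIN fetches profiles and users together."""
    rows = run(
        "SELECT p.id, p.language, u.username "
        "FROM user_profile AS p JOIN auth_user AS u ON u.id = p.user_id "
        "ORDER BY p.id"
    )
    return [
        {"id": pid, "language": language, "user": {"username": username}}
        for pid, language, username in rows
    ]


lazy = serialize_lazy()
lazy_query_count = len(queries)

queries.clear()
joined = serialize_joined()
joined_query_count = len(queries)

print(lazy_query_count, joined_query_count)  # prints "101 1"
```

Both functions return the same data; only the number of round trips differs, and that difference grows linearly with the number of rows.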
I know this is old and you probably solved your problem already, but for anyone else making it to this article:
The problem is that you're doing a blind select_related() with no parameters. That makes Django follow every non-null foreign key it can find, which usually produces a more complex query and returns more data than you need. What you really want is to name the relation explicitly: select_related('user').
Without getting into the details, select_related is for "to one" relationships (ForeignKey, OneToOneField), and prefetch_related is for "to many" relationships (ManyToManyField, reverse ForeignKey). In your case, UserProfile.user is a OneToOneField, which is a "to one" relationship, so select_related is the right tool. Note that 'user_profile' is the reverse name on User, so it is not a valid lookup when querying from UserProfile.
Change get_queryset() in your view to this and I think you'll have what you want:
    def get_queryset(self):
        company = self.request.user.user_profile.company
        return UserProfile.objects.select_related('user').filter(company=company)
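To make the "to one" versus "to many" distinction concrete, here is a rough sketch of the two-query strategy that prefetch_related uses for "to many" relationships: one query for the parent rows, one IN (...) query for all their children, and the stitching done in Python. It uses plain sqlite3 rather than Django, and the schema, sample rows and the `companies_with_profiles` function are assumptions made up for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical schema: one company has many user profiles.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE user_profile (
        id INTEGER PRIMARY KEY,
        company_id INTEGER NOT NULL REFERENCES company (id),
        language TEXT NOT NULL
    );
    INSERT INTO company (id, name) VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO user_profile (id, company_id, language)
    VALUES (1, 1, 'en'), (2, 1, 'de'), (3, 2, 'fr');
    """
)


def companies_with_profiles():
    """Load companies and their profiles with exactly two queries."""
    # Query 1: the parent rows.
    companies = conn.execute(
        "SELECT id, name FROM company ORDER BY id"
    ).fetchall()
    if not companies:
        return []

    # Query 2: every child row for those parents in a single IN (...) lookup.
    ids = [company_id for company_id, _ in companies]
    placeholders = ", ".join("?" for _ in ids)
    children = conn.execute(
        "SELECT id, company_id, language FROM user_profile "
        f"WHERE company_id IN ({placeholders}) ORDER BY id",
        ids,
    ).fetchall()

    # Stitch children to parents in Python instead of in SQL.
    by_company = defaultdict(list)
    for profile_id, company_id, language in children:
        by_company[company_id].append({"id": profile_id, "language": language})

    return [
        {"id": company_id, "name": name, "profiles": by_company[company_id]}
        for company_id, name in companies
    ]


result = companies_with_profiles()
print(result)
```

A JOIN would repeat each company row once per profile, which is why the "to many" case is usually handled with a second query instead.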
ModelSerializers are slow; you said it yourself. Here's some more information on why it happens and how to speed things up: https://hakibenita.com/django-rest-framework-slow
- In performance critical endpoints, use a "regular" serializer, or none at all.
- Serializer fields that are not used for writing or validation should be read only.
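As a rough sketch of the "none at all" option, the read-only output can be built by a plain function that assembles the dict by hand. The dataclasses below are stand-ins for the Django models so the example runs on its own; the field names follow the serializers in the question, while `serialize_user`, `serialize_profile`, `_iso` and the assumption that `mugshot` is already a URL string are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class User:
    """Stand-in for django.contrib.auth.models.User."""
    id: int
    username: str
    first_name: str
    last_name: str
    email: str
    is_active: bool
    date_joined: datetime
    last_login: Optional[datetime] = None


@dataclass
class UserProfile:
    """Stand-in for the UserProfile model from the question."""
    id: int
    user: User
    language: str
    mugshot: Optional[str] = None  # assumption: already a URL string


def _iso(value: Optional[datetime]) -> Optional[str]:
    """Format a datetime as ISO 8601, passing None through."""
    return value.isoformat() if value is not None else None


def serialize_user(user: User) -> Dict[str, Any]:
    """Build the nested user dict without any serializer machinery."""
    return {
        "id": user.id,
        "last_login": _iso(user.last_login),
        "username": user.username,
        "first_name": user.first_name,
        "last_name": user.last_name,
        "email": user.email,
        "is_active": user.is_active,
        "date_joined": _iso(user.date_joined),
    }


def serialize_profile(profile: UserProfile) -> Dict[str, Any]:
    """Build the profile dict, nesting the user."""
    return {
        "id": profile.id,
        "user": serialize_user(profile.user),
        "mugshot": profile.mugshot,
        "language": profile.language,
    }


sample_user = User(
    id=1,
    username="jane",
    first_name="Jane",
    last_name="Doe",
    email="jane@example.com",
    is_active=True,
    date_joined=datetime(2015, 1, 2, 3, 4, 5),
)
sample = serialize_profile(UserProfile(id=7, user=sample_user, language="en"))
print(sample)
```

In a real view the list of dicts would be passed straight to a DRF Response, with the queryset still using select_related('user') so that building the nested dict does not trigger extra queries. The trade-off is that validation and write support have to be handled separately.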