There are dozens of posts about n+1 queries in nested relations in Django, but I can't seem to find the answer to my question. Here's the context:
The Models
class Book(models.Model):
title = models.CharField(max_length=255)
class Tag(models.Model):
book = models.ForeignKey('app.Book', on_delete=models.CASCADE, related_name='tags')
category = models.ForeignKey('app.TagCategory', on_delete=models.PROTECT)
page = models.PositiveIntegerField()
class TagCategory(models.Model):
title = models.CharField(max_length=255)
key = models.CharField(max_length=255)
A book has many tags, each tag belongs to a tag category.
The Serializers
class TagSerializer(serializers.ModelSerializer):
class Meta:
model = Tag
exclude = ['id', 'book']
class BookSerializer(serializers.ModelSerializer):
tags = TagSerializer(many=True, required=False)
class Meta:
model = Book
fields = ['title', 'tags']
def create(self, validated_data):
with transaction.atomic():
tags = validated_data.pop('tags')
book = Book.objects.create(**validated_data)
Tag.objects.bulk_create([Tag(book=book, **tag) for tag in tags])
return book
The Problem
I am trying to POST to the BookViewSet
with the following example data:
{
"title": "The Jungle Book"
"tags": [
{ "page": 1, "category": 36 }, // plot intro
{ "page": 2, "category": 37 }, // character intro
{ "page": 4, "category": 37 }, // character intro
// ... up to 1000 tags
]
}
This all works, however, during the post, the serializer proceeds to make a call for each tag to check if the category_id
is a valid one:
With up to 1000 nested tags in a call, I can't afford this.
How do I "prefetch" for the validation?
If this is impossible, how do I turn off the validation that checks if a foreign_key id is in the database?
EDIT: Additional Info
Here is the view:
class BookViewSet(views.APIView):
queryset = Book.objects.all().select_related('tags', 'tags__category')
permission_classes = [IsAdminUser]
def post(self, request, format=None):
serializer = BookSerializer(data=request.data)
if serializer.is_valid():
serializer.save()
return Response(serializer.data, status=status.HTTP_201_CREATED)
return Response(serializer.errors, status=status.HTTP_400_BAD_REQUEST)
The DRF serializer is not the place (in my own opinion) to optimize a DB query. Serializer has 2 jobs:
Therefore the correct place to optimize your query is the corresponding view.
We will use the select_related
method that:
Returns a QuerySet that will “follow” foreign-key relationships, selecting additional related-object data when it executes its query. This is a performance booster which results in a single more complex query but means later use of foreign-key relationships won’t require database queries. to avoid the N+1 database queries.
You will need to modify the part of your view code that creates the corresponding queryset, in order to include a select_related
call.
You will also need to add a related_name
to the Tag.category
field definition.
Example:
# In your Tag model:
category = models.ForeignKey(
'app.TagCategory', on_delete=models.PROTECT, related_name='categories'
)
# In your queryset defining part of your View:
class BookViewSet(views.APIView):
queryset = Book.objects.all().select_related(
'tags', 'tags__categories'
) # We are using the related_name of the ForeignKey relationships.
If you want to test something different that uses also the serializer to cut down the number of queries, you can check this article.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With