
Django REST serializer doing N+1 database calls for multiple nested relationships, 3 levels

I have a situation where my model has a Foreign Key relationship:

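# models.py
from django.db import models

class Parent(models.Model):
    pass

class Child(models.Model):
    # Parent is defined first so the name resolves; CASCADE matches the old implicit default.
    parent = models.ForeignKey(Parent, on_delete=models.CASCADE)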

and my serializer:

class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        queryset = Child.objects.filter(parent=parent).select_related('parent')
        serialized_data = ChildSerializer(queryset, many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

When my view lists N parents, Django makes N extra database calls inside the serializer, one per parent, to fetch the children. Is there any way to fetch all children for all parents at once to minimize the number of database calls?

I've tried this but it doesn't seem to solve my issue:

class ParentList(generics.ListAPIView):

    def get_queryset(self):
        queryset = Parent.objects.prefetch_related('child')
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)

EDIT

I've updated the code below to reflect Alex's feedback, which solves the N+1 problem for one nested relationship.

# serializer.py
class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        # The all() call should hit the cache
        serialized_data = ChildSerializer(parent.child.all(), many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

# views.py
from django.db.models import Prefetch
from rest_framework import generics, permissions

class ParentList(generics.ListAPIView):

    def get_queryset(self):
        children = Prefetch('child', queryset=Child.objects.select_related('parent'))
        queryset = Parent.objects.prefetch_related(children)
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)

Now let's say I have one more model, which is a grandchild:

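# models.py
class Parent(models.Model):
    pass

class Child(models.Model):
    parent = models.ForeignKey(Parent, on_delete=models.CASCADE)

class GrandChild(models.Model):
    # The field is still called "parent", but it points at Child.
    parent = models.ForeignKey(Child, on_delete=models.CASCADE)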

If I place the following in my views.py for the Parent queryset:

queryset = Parent.objects.prefetch_related(children, 'children__grandchildren')

It doesn't look like the prefetched grandchildren are carried into the ChildSerializer, so I'm running into another N+1 issue. Any thoughts on this one?

EDIT 2

Perhaps this will provide clarity. Maybe the reason I am still running into N+1 database calls is that both my Child and GrandChild classes are polymorphic, i.e.:

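# models.py
from django.db import models
from polymorphic.models import PolymorphicModel

# Base classes are defined before the models that reference them.
class Parent(models.Model):
    pass

class Child(PolymorphicModel):
    parent = models.ForeignKey(Parent, on_delete=models.CASCADE)

class Son(Child):
    pass

class Daughter(Child):
    pass

class GrandChild(PolymorphicModel):
    child = models.ForeignKey(Child, on_delete=models.CASCADE)

class GrandSon(GrandChild):
    pass

class GrandDaughter(GrandChild):
    pass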

and my serializers look more like this:

# serializer.py
class ChildSerializer(serializers.ModelSerializer):
    grandchild = serializers.SerializerMethodField('get_children_ordered')

    def to_representation(self, value):
        if isinstance(value, Son):
            return SonSerializer(value, context=self.context).to_representation(value)
        if isinstance(value, Daughter):
            return DaughterSerializer(value, context=self.context).to_representation(value)

    class Meta:
        model = Child

class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        queryset = Child.objects.filter(parent=parent).select_related('parent')
        serialized_data = ChildSerializer(queryset, many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent

Plus the same for GrandDaughter and GrandSon. I'll spare you the code details, but I think you get the picture.

When I run my view for ParentList and monitor the DB queries, I see thousands of queries for only a handful of parents.

If I run the same code in the Django shell, I can accomplish the same thing with no more than 25 queries. I suspect it has something to do with the fact that I'm using the django-polymorphic library. The reason is that there is a Child and a GrandChild database table in addition to each Son/Daughter and GrandSon/GrandDaughter table, for a total of 6 tables across those objects. So my gut tells me I'm missing those polymorphic tables.

Or perhaps there's a more elegant solution for my data model?

asked Jan 26 '16 18:01 by Dominooch


1 Answer

As far as I remember, nested serializers have access to prefetched relations; just make sure you don't modify the queryset (i.e. use all() rather than filter(), which would issue a fresh query):

class ParentSerializer(serializers.ModelSerializer):
    child = serializers.SerializerMethodField('get_children_ordered')

    def get_children_ordered(self, parent):
        # The all() call should hit the cache
        serialized_data = ChildSerializer(parent.child.all(), many=True, read_only=True, context=self.context)
        return serialized_data.data

    class Meta:
        model = Parent


class ParentList(generics.ListAPIView):

    def get_queryset(self):
        children = Prefetch('child', queryset=Child.objects.select_related('parent'))
        queryset = Parent.objects.prefetch_related(children)
        return queryset

    serializer_class = ParentSerializer
    permission_classes = (permissions.IsAuthenticated,)             
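
To make the query counts concrete, here is a rough illustration of the batching idea behind prefetch_related: one query per relationship level, with the rows grouped in Python afterwards. This is not Django's implementation, only a standard-library sqlite3 sketch; the table names (parent, child, grandchild) and the helpers run, fetch_grouped, load_naive, load_batched and count_queries are all hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY);
    CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER REFERENCES parent(id));
    CREATE TABLE grandchild (id INTEGER PRIMARY KEY, child_id INTEGER REFERENCES child(id));
""")

# Toy data: 3 parents, 2 children per parent, 2 grandchildren per child.
for parent_id in range(1, 4):
    conn.execute("INSERT INTO parent (id) VALUES (?)", (parent_id,))
    for _ in range(2):
        child_id = conn.execute(
            "INSERT INTO child (parent_id) VALUES (?)", (parent_id,)
        ).lastrowid
        conn.executemany(
            "INSERT INTO grandchild (child_id) VALUES (?)", [(child_id,), (child_id,)]
        )

query_count = 0


def run(sql, params=()):
    """Execute a SELECT and count it, so both strategies can be compared."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def load_naive():
    """N+1 style: one query per parent and one per child."""
    tree = {}
    for (pid,) in run("SELECT id FROM parent ORDER BY id"):
        tree[pid] = {}
        for (cid,) in run("SELECT id FROM child WHERE parent_id = ? ORDER BY id", (pid,)):
            rows = run("SELECT id FROM grandchild WHERE child_id = ? ORDER BY id", (cid,))
            tree[pid][cid] = [gid for (gid,) in rows]
    return tree


def fetch_grouped(table, fk, owner_ids):
    """One IN (...) query for a whole level, grouped by foreign key in Python."""
    grouped = defaultdict(list)
    if owner_ids:
        marks = ",".join("?" * len(owner_ids))
        sql = f"SELECT id, {fk} FROM {table} WHERE {fk} IN ({marks}) ORDER BY id"
        for row_id, owner_id in run(sql, owner_ids):
            grouped[owner_id].append(row_id)
    return grouped


def load_batched():
    """Prefetch style: one query per level, regardless of how many rows exist."""
    parent_ids = [pid for (pid,) in run("SELECT id FROM parent ORDER BY id")]
    children = fetch_grouped("child", "parent_id", parent_ids)
    child_ids = [cid for ids in children.values() for cid in ids]
    grandchildren = fetch_grouped("grandchild", "child_id", child_ids)
    return {pid: {cid: grandchildren[cid] for cid in children[pid]} for pid in parent_ids}


def count_queries(loader):
    """Run a loader and return its result together with the number of queries."""
    global query_count
    query_count = 0
    result = loader()
    return result, query_count


naive_tree, naive_queries = count_queries(load_naive)
batched_tree, batched_queries = count_queries(load_batched)
print(naive_queries, batched_queries)
```

On this toy data the naive loader issues 10 queries (1 + 3 + 6) and the batched one issues 3, and both build the same tree. Any per-object filter() inside a serializer puts you back on the naive path, which is why the answer sticks to all().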
answered Oct 06 '22 00:10 by Alex Morozov