Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

List of parents objects and their children with fewer queries

I've got a Django view that I'm trying to optimise. It shows a list of parent objects on a page, along with their children. The child model has the foreign key back to the parent, so select_related doesn't seem to apply.

class Parent(models.Model):
    name = models.CharField(max_length=31)

class Child(models.Model):
    name = models.CharField(max_length=31)
    parent = models.ForeignKey(Parent)

A naive implementation uses n+1 queries, where n is the number of parent objects, ie. one query to fetch the parent list, then one query to fetch the children of each parent.

I've written a view that does the job in two queries - one to fetch the parent objects, another to fetch the related children, then some Python (that I'm far too embarrassed to post here) to put it all back together again.

Once I found myself importing the standard library's collections module I realised that I was probably doing it wrong. There is probably a much easier way, but I lack the Django experience to find it. Any pointers would be much appreciated!

like image 835
leon_matthews Avatar asked Oct 10 '12 07:10

leon_matthews


Video Answer


2 Answers

Add a related_name to the foreign key, then use the prefetch_related method which added to Django 1.4:

Returns a QuerySet that will automatically retrieve, in a single batch, related objects for each of the specified lookups.

This has a similar purpose to select_related, in that both are designed to stop the deluge of database queries that is caused by accessing related objects, but the strategy is quite different:

  • select_related works by creating a SQL join and including the fields of the related object in the SELECT statement. For this reason, select_related gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a 'many' relationship, select_related is limited to single-valued relationships - foreign key and one-to-one.

  • prefetch_related, on the other hand, does a separate lookup for each relationship, and does the 'joining' in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey.

class Parent(models.Model):
    name = models.CharField(max_length=31)

class Child(models.Model):
    name = models.CharField(max_length=31)
    parent = models.ForeignKey(Parent, related_name='children') 


>>> Parent.objects.all().prefetch_related('children')

All the relevant children will be fetched in a single query, and used to make QuerySets that have a pre-filled cache of the relevant results. These QuerySets are then used in the self.children.all() calls.

Note 1 that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.

Note 2 that if you use iterator() to run the query, prefetch_related() calls will be ignored since these two optimizations do not make sense together.

like image 150
defuz Avatar answered Oct 18 '22 11:10

defuz


If you ever need to work with more than 2 levels at once, you can consider a different approach to storing trees in db using MPTT

In a nutshell, it adds data to your model which are updated during updates and allow a much more efficient retrieval.

like image 30
Zeograd Avatar answered Oct 18 '22 11:10

Zeograd