I've got a Django view that I'm trying to optimise. It shows a list of parent objects on a page, along with their children. The child model has the foreign key back to the parent, so select_related
doesn't seem to apply.
class Parent(models.Model):
name = models.CharField(max_length=31)
class Child(models.Model):
name = models.CharField(max_length=31)
parent = models.ForeignKey(Parent)
A naive implementation uses n+1 queries, where n is the number of parent objects, ie. one query to fetch the parent list, then one query to fetch the children of each parent.
I've written a view that does the job in two queries - one to fetch the parent objects, another to fetch the related children, then some Python (that I'm far too embarrassed to post here) to put it all back together again.
Once I found myself importing the standard library's collections
module I realised that I was probably doing it wrong. There is probably a much easier way, but I lack the Django experience to find it. Any pointers would be much appreciated!
Add a related_name
to the foreign key, then use the prefetch_related
method which added to Django 1.4:
Returns a
QuerySet
that will automatically retrieve, in a single batch, related objects for each of the specified lookups.This has a similar purpose to
select_related
, in that both are designed to stop the deluge of database queries that is caused by accessing related objects, but the strategy is quite different:
select_related
works by creating a SQL join and including the fields of the related object in theSELECT
statement. For this reason,select_related
gets the related objects in the same database query. However, to avoid the much larger result set that would result from joining across a 'many' relationship,select_related
is limited to single-valued relationships - foreign key and one-to-one.
prefetch_related
, on the other hand, does a separate lookup for each relationship, and does the 'joining' in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done usingselect_related
, in addition to the foreign key and one-to-one relationships that are supported byselect_related
. It also supports prefetching ofGenericRelation
andGenericForeignKey
.
class Parent(models.Model):
name = models.CharField(max_length=31)
class Child(models.Model):
name = models.CharField(max_length=31)
parent = models.ForeignKey(Parent, related_name='children')
>>> Parent.objects.all().prefetch_related('children')
All the relevant children will be fetched in a single query, and used
to make QuerySets that have a pre-filled cache of the relevant
results. These QuerySets are then used in the self.children.all()
calls.
Note 1 that, as always with QuerySets, any subsequent chained methods which imply a different database query will ignore previously cached results, and retrieve data using a fresh database query.
Note 2 that if you use
iterator()
to run the query,prefetch_related()
calls will be ignored since these two optimizations do not make sense together.
If you ever need to work with more than 2 levels at once, you can consider a different approach to storing trees in db using MPTT
In a nutshell, it adds data to your model which are updated during updates and allow a much more efficient retrieval.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With