I'm experiencing some severe performances issues with prefetch_related
on a Model with 5 m2m fields and I'm pre-fetching also few nested m2m fields.
class TaskModelManager(models.Manager):
def get_queryset(self):
return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).prefetch_related("parent", "takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "status", "approvalWorkflow", "viewers", "requires", "linkedTasks", "activities")
class Task(models.Model):
uuid = models.UUIDField(primary_key=True, default=genOptimUUID, editable=False)
internalStatus = models.IntegerField(default=0)
parent = models.ForeignKey("self", blank=True, null=True, related_name="childs")
name = models.CharField(max_length=45)
taskType = models.ForeignKey("TaskType", null=True)
priority = models.IntegerField()
startDate = models.DateTimeField()
endDate = models.DateTimeField()
status = models.ForeignKey("ProgressionStatus")
assignedUser = models.ForeignKey("Asset", related_name="tasksAssigned")
asset = models.ForeignKey("Asset", related_name="tasksSubject")
viewers = models.ManyToManyField("Asset", blank=True, related_name="followedTasks")
step = models.ForeignKey("Step", blank=True, null=True, related_name="tasks")
approvalWorkflow = models.ForeignKey("ApprovalWorkflow")
linkedTasks = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="linkedTo")
requires = models.ManyToManyField("self", symmetrical=False, blank=True, related_name="depends")
objects = TaskModelManager()
The number of query is fine and the database query time is fine too, for exemple if I query 700 objects of my model i have 35 query and the average query time is 100~200ms but the total request time is approximately 8 seconds.
silk times
I've run some profiling and it pointed out that more than 80% of the time spent was on the prefetch_related_objects
call.
profiling
I'm using Django==1.8.5
and djangorestframework==3.4.6
I'm open to any way to optimize this. Thanks in advance for your help.
Edit with select_related
:
I've tried the improvement proposed by Alasdair
class TaskModelManager(models.Manager):
def get_queryset(self):
return super(TaskModelManager, self).get_queryset().exclude(internalStatus=2).select_related("parent", "status", "approvalWorkflow", "step").prefetch_related("takes", "takes__flags", "assignedUser", "assignedUser__flags", "asset", "asset__flags", "viewers", "requires", "linkedTasks", "activities")
The new result is still 8 seconds for the request with 32 queries and 150ms of query time.
Edit :
It seems that a ticket was opened on Django issue tracker 4 years ago and is still open : https: //code.djangoproject.com/ticket/20577
I ran into the same problem.
Following the issue you linked i found that you can improve the prefetch_related
performance using Prefetch
object and to_attr
argument.
From the commit that introduces the Prefetch
object:
When a Prefetch instance specifies a to_attr argument, the result is stored in a list rather than a QuerySet. This has the fortunate consequence of being significantly faster. The preformance improvement is due to the fact that we save the costly creation of a QuerySet instance.
So i significantly improved my code (from about 7 seconds to 0.88 seconds) simply by calling:
Foo.objects.filter(...).prefetch_related(Prefetch('bars', to_attr='bars_list'))
instead of
Foo.objects.filter(...).prefetch_related('bars')
Try using select_related
for foreign keys like parent
and ApprovalWorkflow
instead of prefetch_related
.
When you use select_related
, Django will fetch the models using a join, unlike prefetch_related
which causes an extra query. You might find that this improves performance.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With