I have a legacy code which uses nested ORM query, which produces SQL SELECT query with JOIN, and conditions which also contains SELECT and JOIN. Execution of this query takes enormous time. By the way, when I execute this query in raw SQL, taken from Django_ORM_query.query
, it performs with reasonable time.
What are best practices for optimization in such cases?
Would the query perform faster if I will use ManyToMany
and ForeignKey
relations?
1. Using select_related() and prefetch_related() functions. Django ORM provides two common methods to avoid the N+1 issue, which are select_related and prefetch_related. The select_related method performs a simple JOIN operation of tables and can be used for foreign key or one-to-one model fields.
Django's ORM is fantastic. It's slow because it chooses to be convenient but if it needs to be fast it's just a few slight API calls away. If you're curious, check out the code on Github.
Using select_related() Django offers a QuerySet method called select_related() that allows you to retrieve related objects for one-to-many relationships. This translates to a single, more complex QuerySet, but you avoid additional queries when accessing the related objects.
Performance issue in Django is usually caused by following relations in a loop, which causes multiple database queries. If you have django-debug-toolbar installed, you can check for how many queries you're doing and figure out which query needs to be optimized. The debug toolbar also shows you the time of each queries, which is essential for optimizing django, you're missing out a lot if you didn't have it installed or didn't use it.
You'd generally solve the problem of following relations by using select_related() or prefetch_related().
A page generally should have at most 20-30 queries, any more and it's going to seriously affect performance. Most pages should only have 5-10 queries. You want to reduce the number of queries because round trip is the number one killer of database performance. In general one big query is faster than 100 small queries.
The number two killer of database performance is much rarer a problem, though it sometimes arises because of techniques that reduces the number of queries. Your query might simply be too big, if this is the case, you should use defer() or only() so you don't load large fields that you know you won't be using.
When in doubt, use raw SQL. That's a completely valid optimization in Django world.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With