Optimizing Django: nested queries vs relation lookups

I have legacy code that uses a nested ORM query. It produces a SQL SELECT with a JOIN, whose conditions themselves contain a SELECT with a JOIN. Executing this query through the ORM takes an enormous amount of time. However, when I run the same query as raw SQL, taken from Django_ORM_query.query, it completes in reasonable time.

What are the best practices for optimization in such cases?
Would the query perform faster if I used ManyToMany and ForeignKey relations?

user2115719 asked Feb 27 '13 21:02



2 Answers

Performance issues in Django are usually caused by following relations in a loop, which triggers multiple database queries. If you have django-debug-toolbar installed, you can check how many queries you're running and figure out which one needs to be optimized. The debug toolbar also shows the time taken by each query, which is essential for optimizing Django; you're missing out on a lot if you don't have it installed or don't use it.

You'd generally solve the problem of following relations by using select_related() or prefetch_related().

As a rule of thumb, a page should run at most 20-30 queries; any more and it's going to seriously affect performance. Most pages should need only 5-10 queries. You want to reduce the number of queries because round trips are the number one killer of database performance. In general, one big query is faster than 100 small queries.

The number two killer of database performance is a much rarer problem, though it sometimes arises because of techniques that reduce the number of queries. Your query might simply be too big; if this is the case, you should use defer() or only() so you don't load large fields that you know you won't be using.

Lie Ryan answered Oct 17 '22 17:10


When in doubt, use raw SQL. That's a completely valid optimization in the Django world.

Jack Shedd answered Oct 17 '22 18:10