Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I cut down the number of queries?

Tags:

python

django

This code is currently executing about 50 SQL queries:

c = Category.objects.all()

categories_w_rand_books = []

for category in c:
    r = Book.objects.filter(author__category=category).order_by('?')[:5]

    categories_w_rand_books.append((category, r))

I need to cut down the number of used queries to the minimum to speed up things and do not cause server load.

Basically, I have three models: Category, Author, Book. The Author belong to the Category (not books) and I need to get a list of all categories with 5 random books under each one.

like image 692
Adam Silver Avatar asked Dec 31 '12 02:12

Adam Silver


1 Answers

If you prefer single query and are using MySQL, check the excellent link provided by @Crazyshezy in his comment.
For PostgreSQL backends, a possible query is (assuming there are non-nullable FK relationships from Book to Author and from Author to Category):

SELECT * FROM (
    SELECT book_table.*, row_number() OVER (PARTITION BY category_id ORDER BY RANDOM()) AS rn 
    FROM book_table INNER JOIN author_table ON book_table.author_id = author_table.id
) AS sq 
WHERE rn <= 5 

You could then wrap it inside a RawQuerySet to get Book instances

from collections import defaultdict
qs = Book.objects.raw("""The above sql suited for your tables...""")
collection = defaultdict(list)
for obj in qs:
    collection[obj.category_id].append(obj)

categories_w_rand_books = []
for category in c:
    categories_w_rand_books.append((category, collection[category.id]))

You may not want to run this query for each request directly w/o some caching.

Furthermore, your code generates at most 50*5=250 Books, randomly, I just wonder why because it seems too many for a single page. Are items displayed as tabs or something else? Perhaps you could reduce the counts of SQLs by doing Ajax, or simplify the requirement?

Update

To use book.author w/o triggering more than another query, try prefetch_related_objects

from django.db.models.query import prefetch_related_objects
qs = list(qs) # have to evaluate at first
prefetch_related_objects(qs, ['author'])
# now instances inside qs already contain cached author instances, and
qs[0].author # will not trigger an extra query

The above code prefetches authors in batch and fills them into the qs. This just adds another query.

like image 183
okm Avatar answered Sep 21 '22 14:09

okm