
How do I make Django ManyToMany 'through' queries more efficient?

I'm using a ManyToManyField with a 'through' class and this results in a lot of queries when fetching a list of things. I'm wondering if there's a more efficient way.

For example, here are some simplified classes describing Books and their several authors, linked through a Role class (to define roles like "Editor", "Illustrator", etc.):

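from django.db import models

class Person(models.Model):
    first_name = models.CharField(max_length=100)
    last_name = models.CharField(max_length=100)

    @property
    def full_name(self):
        return ' '.join([self.first_name, self.last_name])

class Role(models.Model):
    name = models.CharField(max_length=50)
    # on_delete is required since Django 2.0.
    person = models.ForeignKey(Person, on_delete=models.CASCADE)
    # 'Book' is quoted because the Book class is defined further down.
    book = models.ForeignKey('Book', on_delete=models.CASCADE)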

class Book(models.Model):
    title = models.CharField(max_length=255)
    authors = models.ManyToManyField(Person, through='Role')

    @property
    def authors_names(self):
        names = []
        for role in self.role_set.all():
            person_name = role.person.full_name
            if role.name:
                person_name += ' (%s)' % (role.name,)
            names.append(person_name)
        return ', '.join(names)

If I access book.authors_names on a Book instance, I get a string something like this:

John Doe (Editor), Fred Bloggs, Billy Bob (Illustrator)

It works fine but it does one query to get the Roles for the book, and then another query for every Person. If I'm displaying a list of Books, this adds up to a lot of queries.

Is there a way to do this more efficiently, in a single query per Book, with a join? Or is the only way to use something like batch-select?

(For bonus points... my coding of authors_names() looks a bit clunky - is there a way to make it more elegantly Python-esque?)

Phil Gyford asked Nov 23 '10 11:11


2 Answers

This is a pattern I come across often in Django. It's really easy to create properties such as your authors_names, and they work great when you display one book, but the number of queries explodes when you use the property for many books on a page.

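Firstly, you can use select_related to avoid a separate lookup for every person:

@property
def authors_names(self):
    names = []
    # select_related('person') joins Person into the roles query.
    # (select_related(depth=1) only works on old Django versions.)
    for role in self.role_set.all().select_related('person'):
        person_name = role.person.full_name
        if role.name:
            person_name += ' (%s)' % (role.name,)
        names.append(person_name)
    return ', '.join(names)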

However, this doesn't solve the problem of looking up the roles for every book.
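
To put rough numbers on that, here is a back-of-the-envelope sketch in plain Python. The function name and strategy labels are made up for illustration. It assumes one query to fetch the books, one query per lazy access after that, and no caching between rows. The 'batched' figure is the two-query target: one query for the books and one for all of their roles.

```python
def estimated_queries(num_books, roles_per_book, strategy):
    """Rough query count for showing authors_names on a list of books.

    Hypothetical helper: assumes one query for the books themselves and
    one query per lazy access afterwards, with nothing cached.
    """
    if strategy == 'naive':
        # 1 for the books, 1 per book for its roles, 1 per role for its person.
        return 1 + num_books + num_books * roles_per_book
    if strategy == 'select_related':
        # Each roles query now joins in the person, so only the
        # per-book roles query remains.
        return 1 + num_books
    if strategy == 'batched':
        # 1 for the books, 1 for all of their roles (persons joined in).
        return 2
    raise ValueError('unknown strategy: %r' % (strategy,))


for strategy in ('naive', 'select_related', 'batched'):
    print(strategy, estimated_queries(20, 3, strategy))
```

For a page of 20 books with 3 roles each, this gives 81 queries for the original property, 21 with select_related, and 2 if all roles are fetched in one go.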

If you are displaying a list of books, you can look up all the roles for your books in one query, then cache them.

>>> from collections import defaultdict
>>> books = Book.objects.filter(**your_kwargs)
>>> roles = Role.objects.filter(book__in=books).select_related('person')
>>> roles_by_book = defaultdict(list)
>>> for role in roles:
...     roles_by_book[role.book_id].append(role)

You can then access a book's roles through the roles_by_book dictionary, keyed by book id.

>>> for book in books:
...     book_roles = roles_by_book[book.id]

You will have to rethink your authors_names property to use caching like this.
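
One possible shape for that rethink, as a minimal sketch rather than a definitive implementation: build the string from roles that have already been fetched, instead of querying inside the property. To keep it runnable without a database, the Person and Role classes below are plain dataclass stand-ins for the Django models, and group_roles_by_book, format_role and the two-argument authors_names are hypothetical helpers. The dictionary is assumed to be keyed by book id.

```python
from collections import defaultdict
from dataclasses import dataclass


# Plain stand-ins for the Django models so the sketch runs on its own.
@dataclass
class Person:
    first_name: str
    last_name: str

    @property
    def full_name(self):
        return ' '.join([self.first_name, self.last_name])


@dataclass
class Role:
    book_id: int
    person: Person
    name: str = ''


def group_roles_by_book(roles):
    """Group already-fetched roles into {book_id: [role, ...]}."""
    roles_by_book = defaultdict(list)
    for role in roles:
        roles_by_book[role.book_id].append(role)
    return roles_by_book


def format_role(role):
    """Return 'Full Name (Role)', or just 'Full Name' if the role is unnamed."""
    if role.name:
        return '%s (%s)' % (role.person.full_name, role.name)
    return role.person.full_name


def authors_names(book_id, roles_by_book):
    """Build the display string from the cached roles only (no lookups)."""
    return ', '.join(format_role(r) for r in roles_by_book.get(book_id, []))


roles = [
    Role(1, Person('John', 'Doe'), 'Editor'),
    Role(1, Person('Fred', 'Bloggs')),
    Role(2, Person('Billy', 'Bob'), 'Illustrator'),
]
roles_by_book = group_roles_by_book(roles)
print(authors_names(1, roles_by_book))  # John Doe (Editor), Fred Bloggs
```

In a real view, the roles list would come from the single Role query, and the result could be attached to each book before rendering.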


I'll shoot for the bonus points as well.

Add a property to Role that renders the full name and role name.

class Role(models.Model):
    ...
    @property
    def name_and_role(self):
        out = self.person.full_name
        if self.name:
            out += ' (%s)' % self.name
        return out

The authors_names property then collapses to a one-liner similar to Paulo's suggestion:

@property
def authors_names(self):
    return ', '.join(role.name_and_role for role in self.role_set.all())
Alasdair answered Nov 14 '22 02:11


I would declare authors = models.ManyToManyField(Role) and store the full name in Role.alias, because the same person can sign books under different pseudonyms.

As for the clunkiness, this:

def authors_names(self):
    names = []
    for role in self.role_set.all():
        person_name = role.person.full_name
        if role.name:
            person_name += ' (%s)' % (role.name,)
        names.append(person_name)
    return ', '.join(names)

Could be:

def authors_names(self):
    # Only add the "(role)" suffix when the role has a name.
    return ', '.join(
        '%s (%s)' % (role.person.full_name, role.name) if role.name
        else role.person.full_name
        for role in self.role_set.all())
Paulo Scardine answered Nov 14 '22 03:11