UPDATE: An open ticket about this issue: 24272
What's it all about?
Django has a GenericRelation class, which adds a “reverse” generic relationship to enable an additional API.
It turns out we can use this reverse generic relation for filtering or ordering, but we can't use it inside prefetch_related.
I was wondering whether this is a bug, whether it's not supposed to work, or whether it's something that could be implemented in the future.
Let me show you with some examples what I mean.
Let's say we have two main models: Movies and Books.
Movies have a Director.
Books have an Author.
And we want to assign tags to our Movies and Books, but instead of using MovieTag and BookTag models, we want to use a single TaggedItem class with a GFK to Movie or Book.
Here is the model structure:
    from django.db import models
    from django.contrib.contenttypes.fields import GenericForeignKey, GenericRelation
    from django.contrib.contenttypes.models import ContentType


    class TaggedItem(models.Model):
        tag = models.SlugField()
        content_type = models.ForeignKey(ContentType)
        object_id = models.PositiveIntegerField()
        content_object = GenericForeignKey('content_type', 'object_id')

        def __unicode__(self):
            return self.tag


    class Director(models.Model):
        name = models.CharField(max_length=100)

        def __unicode__(self):
            return self.name


    class Movie(models.Model):
        name = models.CharField(max_length=100)
        director = models.ForeignKey(Director)
        tags = GenericRelation(TaggedItem, related_query_name='movies')

        def __unicode__(self):
            return self.name


    class Author(models.Model):
        name = models.CharField(max_length=100)

        def __unicode__(self):
            return self.name


    class Book(models.Model):
        name = models.CharField(max_length=100)
        author = models.ForeignKey(Author)
        tags = GenericRelation(TaggedItem, related_query_name='books')

        def __unicode__(self):
            return self.name
And some initial data:
    >>> from tags.models import Book, Movie, Author, Director, TaggedItem
    >>> a = Author.objects.create(name='E L James')
    >>> b1 = Book.objects.create(name='Fifty Shades of Grey', author=a)
    >>> b2 = Book.objects.create(name='Fifty Shades Darker', author=a)
    >>> b3 = Book.objects.create(name='Fifty Shades Freed', author=a)
    >>> d = Director.objects.create(name='James Gunn')
    >>> m1 = Movie.objects.create(name='Guardians of the Galaxy', director=d)
    >>> t1 = TaggedItem.objects.create(content_object=b1, tag='roman')
    >>> t2 = TaggedItem.objects.create(content_object=b2, tag='roman')
    >>> t3 = TaggedItem.objects.create(content_object=b3, tag='roman')
    >>> t4 = TaggedItem.objects.create(content_object=m1, tag='action movie')
So, as the docs show, we can do things like this:
    >>> b1.tags.all()
    [<TaggedItem: roman>]
    >>> m1.tags.all()
    [<TaggedItem: action movie>]
    >>> TaggedItem.objects.filter(books__author__name='E L James')
    [<TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: roman>]
    >>> TaggedItem.objects.filter(movies__director__name='James Gunn')
    [<TaggedItem: action movie>]
    >>> Book.objects.all().prefetch_related('tags')
    [<Book: Fifty Shades of Grey>, <Book: Fifty Shades Darker>, <Book: Fifty Shades Freed>]
    >>> Book.objects.filter(tags__tag='roman')
    [<Book: Fifty Shades of Grey>, <Book: Fifty Shades Darker>, <Book: Fifty Shades Freed>]
But if we try to prefetch some related data of TaggedItem via this reverse generic relation, we get an AttributeError.
    >>> TaggedItem.objects.all().prefetch_related('books')
    Traceback (most recent call last):
      ...
    AttributeError: 'Book' object has no attribute 'object_id'
Some of you may ask why I don't just use content_object instead of books here. The reason is that this only works when we want to:
1) prefetch only one level deep from querysets containing different types of content_object.
    >>> TaggedItem.objects.all().prefetch_related('content_object')
    [<TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: action movie>]
2) prefetch many levels, but from querysets containing only one type of content_object.
    >>> TaggedItem.objects.filter(books__author__name='E L James').prefetch_related('content_object__author')
    [<TaggedItem: roman>, <TaggedItem: roman>, <TaggedItem: roman>]
But if we want both 1) and 2) (to prefetch many levels from a queryset containing different types of content_object), we can't use content_object.
    >>> TaggedItem.objects.all().prefetch_related('content_object__author')
    Traceback (most recent call last):
      ...
    AttributeError: 'Movie' object has no attribute 'author_id'
Django thinks that all content_objects are Books, and thus that they have an Author.
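A simplified pure-Python model of that failure, as a sketch only: the classes and the `collect_fk_values` helper below are hypothetical stand-ins, not Django internals. It assumes the prefetch step reads the same foreign-key attribute from every object in the list, which is consistent with the traceback above.

```python
from dataclasses import dataclass


@dataclass
class Book:
    name: str
    author_id: int


@dataclass
class Movie:
    name: str
    director_id: int


def collect_fk_values(instances, attname):
    """Read the same foreign-key attribute from every instance.

    Hypothetical helper: it mimics a prefetch step that assumes all
    instances are of the same model.
    """
    return {getattr(obj, attname) for obj in instances}


# Homogeneous list: every object has author_id, so this works.
books = [Book("Fifty Shades of Grey", 1), Book("Fifty Shades Darker", 1)]
book_author_ids = collect_fk_values(books, "author_id")

# Mixed list: the Movie has no author_id, so the lookup fails.
mixed = books + [Movie("Guardians of the Galaxy", 7)]
try:
    collect_fk_values(mixed, "author_id")
    error_message = None
except AttributeError as exc:
    error_message = str(exc)
```

With a homogeneous list the ids are collected in one pass; with a mixed list the first Movie triggers the same kind of AttributeError as in the traceback.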
Now imagine the situation where we want to prefetch not only the books with their author, but also the movies with their director. Here are a few attempts.
The silly way:
    >>> TaggedItem.objects.all().prefetch_related(
    ...     'content_object__author',
    ...     'content_object__director',
    ... )
    Traceback (most recent call last):
      ...
    AttributeError: 'Movie' object has no attribute 'author_id'
Maybe with a custom Prefetch object?
    >>> from django.db.models import Prefetch
    >>> TaggedItem.objects.all().prefetch_related(
    ...     Prefetch('content_object', queryset=Book.objects.all().select_related('author')),
    ...     Prefetch('content_object', queryset=Movie.objects.all().select_related('director')),
    ... )
    Traceback (most recent call last):
      ...
    ValueError: Custom queryset can't be used for this lookup.
Some solutions to this problem are shown here. But that's a lot of massaging of the data, which I want to avoid. I really like the API coming from the reverse generic relations; it would be very nice to be able to do prefetches like this:
    >>> TaggedItem.objects.all().prefetch_related(
    ...     'books__author',
    ...     'movies__director',
    ... )
    Traceback (most recent call last):
      ...
    AttributeError: 'Book' object has no attribute 'object_id'
Or like this:
    >>> TaggedItem.objects.all().prefetch_related(
    ...     Prefetch('books', queryset=Book.objects.all().select_related('author')),
    ...     Prefetch('movies', queryset=Movie.objects.all().select_related('director')),
    ... )
    Traceback (most recent call last):
      ...
    AttributeError: 'Book' object has no attribute 'object_id'
But as you can see, we always get that AttributeError. I'm using Django 1.7.3 and Python 2.7.6, and I'm curious why Django is throwing that error. Why is Django searching for an object_id in the Book model? Why do I think this may be a bug? Usually, when we ask prefetch_related to resolve something it can't, we see:
    >>> TaggedItem.objects.all().prefetch_related('some_field')
    Traceback (most recent call last):
      ...
    AttributeError: Cannot find 'some_field' on TaggedItem object, 'some_field' is an invalid parameter to prefetch_related()
But here, it is different. Django actually tries to resolve the relation... and fails. Is this a bug which should be reported? I have never reported anything to Django so that's why I'm asking here first. I'm unable to trace the error and decide for myself if this is a bug, or a feature which could be implemented.
If you want to retrieve Book instances and prefetch the related tags, use Book.objects.prefetch_related('tags'). There is no need to use the reverse relation here.
You can also have a look at the related tests in the Django source code.
Also, the Django documentation states that prefetch_related() is supposed to work with GenericForeignKey and GenericRelation:
prefetch_related, on the other hand, does a separate lookup for each relationship, and does the ‘joining’ in Python. This allows it to prefetch many-to-many and many-to-one objects, which cannot be done using select_related, in addition to the foreign key and one-to-one relationships that are supported by select_related. It also supports prefetching of GenericRelation and GenericForeignKey.
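The "separate lookup plus joining in Python" idea from the quoted passage can be sketched without Django, using the standard-library sqlite3 module. The table and column names are made up for illustration, and this is not the SQL that Django itself generates; it only shows the two-query pattern.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE book (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, book_id INTEGER, tag TEXT);
    INSERT INTO book VALUES (1, 'Fifty Shades of Grey'), (2, 'Fifty Shades Darker');
    INSERT INTO tag VALUES (1, 1, 'roman'), (2, 2, 'roman'), (3, 2, 'sequel');
    """
)

# Query 1: the main objects.
books = conn.execute("SELECT id, name FROM book ORDER BY id").fetchall()

# Query 2: all related rows at once, restricted with IN (...).
book_ids = [book_id for book_id, _ in books]
placeholders = ", ".join("?" for _ in book_ids)
tag_rows = conn.execute(
    f"SELECT book_id, tag FROM tag WHERE book_id IN ({placeholders}) ORDER BY id",
    book_ids,
).fetchall()

# The "join" happens in Python: group related rows by their parent id.
tags_by_book = defaultdict(list)
for book_id, tag in tag_rows:
    tags_by_book[book_id].append(tag)

result = {name: tags_by_book[book_id] for book_id, name in books}
```

Two queries are issued no matter how many books there are, instead of one extra query per book.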
UPDATE: To prefetch the content_object for a TaggedItem you can use TaggedItem.objects.all().prefetch_related('content_object'). If you want to limit the result to only tagged Book objects, you can additionally filter on the ContentType (I'm not sure whether prefetch_related works with the related_query_name). If you also want to get the Author together with each book, note that a custom Prefetch queryset (for example one using select_related()) is rejected for content_object with the ValueError shown in the question. Once the queryset is restricted to a single content type, you can follow the relation in the lookup string instead:

    from django.contrib.contenttypes.models import ContentType

    book_ct = ContentType.objects.get_for_model(Book)
    # Every item is a Book here, so content_object__author can be resolved.
    TaggedItem.objects.filter(content_type=book_ct).prefetch_related(
        'content_object__author'
    )
prefetch_related_objects to the rescue.
Starting from Django 1.10 (note: it was present in earlier versions, but was not part of the public API), we can use prefetch_related_objects to divide and conquer our problem.
prefetch_related is an operation where Django fetches related data after the queryset has been evaluated (doing a second query after the main one). In order to work, it expects the items in the queryset to be homogeneous (of the same type). The main reason the reverse generic relation does not work right now is that we have objects of different content types, and the code is not yet smart enough to separate the flow for different content types.
Now, using prefetch_related_objects, we do fetches only on a subset of our queryset where all the items are homogeneous. Here is an example:
    from django.db.models import prefetch_related_objects
    from django.core.paginator import Paginator
    from django.contrib.contenttypes.models import ContentType
    from tags.models import TaggedItem, Book, Movie

    tagged_items = TaggedItem.objects.all()
    paginator = Paginator(tagged_items, 25)
    page = paginator.get_page(1)

    # Prefetch books with their author.
    # Do this only for items where tagged_item.content_object is a Book.
    book_ct = ContentType.objects.get_for_model(Book)
    tags_with_books = [item for item in page.object_list if item.content_type_id == book_ct.id]
    prefetch_related_objects(tags_with_books, "content_object__author")

    # Prefetch movies with their director.
    # Do this only for items where tagged_item.content_object is a Movie.
    movie_ct = ContentType.objects.get_for_model(Movie)
    tags_with_movies = [item for item in page.object_list if item.content_type_id == movie_ct.id]
    prefetch_related_objects(tags_with_movies, "content_object__director")

    # This makes 5 queries for the data (excluding the paginator's COUNT
    # query and any uncached ContentType lookups):
    # 1 for page items
    # 1 for books
    # 1 for book authors
    # 1 for movies
    # 1 for movie directors

    # Iterating over the items won't make any further queries.
    for item in page.object_list:
        # Do something with item.content_object
        # and item.content_object.author/director.
        print(
            item,
            item.content_object,
            getattr(item.content_object, 'author', None),
            getattr(item.content_object, 'director', None)
        )
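The split-by-content-type step above can be folded into a small helper. This is only a sketch: `prefetch_by_content_type` is a hypothetical name, not a Django API, and the prefetch callable is passed in so that the example runs without Django; in a real project you would pass Django's prefetch_related_objects, whose call shape is `prefetch_related_objects(instances, *lookups)`.

```python
from collections import defaultdict


def prefetch_by_content_type(items, lookups_by_ct_id, prefetch):
    """Group items by content_type_id and prefetch each group separately.

    `lookups_by_ct_id` maps a content type id to a list of lookup strings.
    `prefetch` is any callable with the shape prefetch(instances, *lookups),
    for example Django's prefetch_related_objects.
    """
    groups = defaultdict(list)
    for item in items:
        groups[item.content_type_id].append(item)

    for ct_id, group in groups.items():
        lookups = lookups_by_ct_id.get(ct_id)
        if lookups:
            # Each group is homogeneous, so the lookups can be resolved.
            prefetch(group, *lookups)
    return dict(groups)


# Stand-in objects and a recording fake so the sketch runs without Django.
class FakeItem:
    def __init__(self, content_type_id):
        self.content_type_id = content_type_id


calls = []


def fake_prefetch(instances, *lookups):
    calls.append((len(instances), lookups))


items = [FakeItem(1), FakeItem(2), FakeItem(1)]
groups = prefetch_by_content_type(
    items,
    {1: ["content_object__author"], 2: ["content_object__director"]},
    fake_prefetch,
)
```

In the example from this answer, the mapping would be built from book_ct.id and movie_ct.id, and the items would be page.object_list.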