Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django prefetch_related with limit

Is there a way to tell prefetch_related to only fetch a limited set of related objects? Lets say I am fetching a list of users and I know I want to fetch their recent comments. Instead of fetching comments for each user in a loop, I use prefetch_related to pre-fetch them at the time of fetching the users. My understanding is that this will fetch all the comments made by any user present in the result of the original query but I only want to show the latest 5 for each user.

How does this affect the performance if the list of comments is really huge? Is there a way to fetch only 5 comments for each user in a single (or 2) query? It doesn't have to be the same query as the original one for fetching users but that would be nice.

I essentially want to turn this

   users = User.objects.all()
   for user in users:
       user.comments.all()[:10]

into something like this

 User.objects.all().prefetch_related('comments', limit=10)

so if a user has 100s or 10000s of comments, they are not all loaded into memory. How would you do something like this in raw SQL?

like image 706
Owais Lone Avatar asked Jul 11 '14 05:07

Owais Lone


3 Answers

I think there is a workaround now to in django new version as we have OuterRef and Subquery.

from django.db.models import OuterRef, Subquery, Prefetch

subqry = Subquery(Comment.objects \
    .filter(user_id=OuterRef('user_id')) \
    .values_list('id', flat=True)[:5])

User.objects.prefetch_related(
    Prefetch('comments', queryset=Comment.objects.filter(id__in=subqry)))
like image 185
haseebahmad Avatar answered Oct 17 '22 17:10

haseebahmad


The only way to limit the number of prefetched related objects seems to be using Prefetch() and filtering on fileds. Using sliceing

User.objects.all().prefetch_related(
    Prefetch('msg_sent', queryset=UserMsg.objects.order_by('-created')[:10]))

returns an error

AssertionError: Cannot filter a query once a slice has been taken.

The only way to limit the number of related objects seems to be using filter on a value, for example

from datetime import datetime, timedelta
timelimit = datetime.now() - timedelta(days=365)

User.objects.all().prefetch_related(
    Prefetch('msg_sent', queryset=UserMsg.objects.filter(created__gte=timelimit)))

While that doesn't return a fixed number, in may be useful in some situation, and it will reduce the number of prefetched objects.

like image 38
C14L Avatar answered Oct 17 '22 16:10

C14L


thats what actually works for me django(2.1) (based on haseebahmad answer).
in order for prefetch_related to accept customize queryset: Prefetch
so:

from django.db.models import OuterRef, Subquery ,Prefetch

User.objects.all().prefetch_related(Prefetch('comment_set',  
queryset=Comment.objects.filter(id__in= 
Subquery(Comment.objects.filter(user_id=OuterRef('user_id')).
values_list('id', flat=True)[:1]))))
like image 4
Rafi Avatar answered Oct 17 '22 15:10

Rafi