Django ORM values_list with '__in' filter performance

What is the preferred way to filter a queryset with '__in' in Django?

providers = Provider.objects.filter(age__gt=10)
consumers = Consumer.objects.filter(consumer__in=providers)

or

providers_ids = Provider.objects.filter(age__gt=10).values_list('id', flat=True)
consumers = Consumer.objects.filter(consumer__in=providers_ids)
Pavel Patrin asked Dec 07 '22 00:12



2 Answers

These should be equivalent. Under the hood, Django compiles both into a subselect query in SQL. See the QuerySet API reference on the in lookup:

This queryset will be evaluated as a subselect statement:

SELECT ... WHERE consumer.id IN (SELECT id FROM ... WHERE _ IN _)

However, you can force a lookup on explicit primary key values by calling list on your values_list, which evaluates the provider query first, like so:

providers_ids = list(Provider.objects.filter(age__gt=10).values_list('id', flat=True))
consumers = Consumer.objects.filter(consumer__in=providers_ids)

This could be more performant in some cases, for example when you have few providers, but it depends entirely on what your data looks like and which database you're using. See the "Performance considerations" note in the QuerySet API reference.
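To make the difference concrete, here is a minimal sketch at the SQL level using Python's built-in sqlite3 module rather than Django itself. The provider and consumer tables, their columns, and the sample rows are hypothetical stand-ins for the models in the question, and the SQL is only an approximation of what the ORM would emit.

```python
import sqlite3

# Hypothetical schema standing in for the Provider and Consumer models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE consumer (id INTEGER PRIMARY KEY,
                           consumer_id INTEGER REFERENCES provider(id));
    INSERT INTO provider (id, age) VALUES (1, 5), (2, 15), (3, 30);
    INSERT INTO consumer (id, consumer_id) VALUES (10, 1), (11, 2), (12, 3), (13, 3);
""")

# Variant 1: a single statement; the database resolves the inner SELECT itself.
subselect_rows = conn.execute(
    "SELECT id FROM consumer "
    "WHERE consumer_id IN (SELECT id FROM provider WHERE age > 10) "
    "ORDER BY id"
).fetchall()

# Variant 2: two statements; the ids are fetched first and then sent back
# as explicit values, which is roughly what wrapping values_list in list() does.
provider_ids = [row[0] for row in conn.execute("SELECT id FROM provider WHERE age > 10")]
placeholders = ", ".join("?" for _ in provider_ids)  # assumes at least one id
explicit_rows = conn.execute(
    f"SELECT id FROM consumer WHERE consumer_id IN ({placeholders}) ORDER BY id",
    provider_ids,
).fetchall()

print(subselect_rows)
print(explicit_rows)
```

Both variants return the same rows. What differs is the number of round trips and how much data travels between the application and the database, which is why the better choice depends on how many ids there are and on the backend.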

Wilduck answered Dec 10 '22 12:12



I agree with Wilduck. However, a couple of notes.

First, you can combine the two filters into a single one that spans the relationship:

consumers = Consumer.objects.filter(consumer__age__gt=10)

This gives you the same result set in a single query.
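As a rough illustration of why the results match, the sketch below compares a join with the subselect form in plain SQL, using Python's built-in sqlite3 module. The tables and sample rows are hypothetical, and the join is only an approximation of the SQL Django would generate for a filter that spans the relationship.

```python
import sqlite3

# Hypothetical schema standing in for the Provider and Consumer models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE provider (id INTEGER PRIMARY KEY, age INTEGER);
    CREATE TABLE consumer (id INTEGER PRIMARY KEY,
                           consumer_id INTEGER REFERENCES provider(id));
    INSERT INTO provider (id, age) VALUES (1, 5), (2, 15), (3, 30);
    INSERT INTO consumer (id, consumer_id) VALUES (10, 1), (11, 2), (12, 3), (13, 3);
""")

# Join form: roughly what filtering on consumer__age__gt=10 translates to.
join_rows = conn.execute(
    "SELECT consumer.id FROM consumer "
    "INNER JOIN provider ON consumer.consumer_id = provider.id "
    "WHERE provider.age > 10 "
    "ORDER BY consumer.id"
).fetchall()

# Subselect form: roughly what the __in filter translates to.
subselect_rows = conn.execute(
    "SELECT id FROM consumer "
    "WHERE consumer_id IN (SELECT id FROM provider WHERE age > 10) "
    "ORDER BY id"
).fetchall()

print(join_rows)
print(subselect_rows)
```

Each consumer row points at exactly one provider, so the join cannot duplicate consumers and the two forms select the same rows.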

Second, to analyze the generated query, you can print the queryset's .query attribute.

Example:

print(Provider.objects.filter(age__gt=10).query)

This prints the SQL the ORM would generate to fetch the result set.

karthikr answered Dec 10 '22 10:12
