Django QuerySet Two-Valued Subquery

Given a model

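from django.db import models

class Entity(models.Model):
    identifier = models.IntegerField()
    created = models.IntegerField()
    content = models.IntegerField()

    class Meta:
        # Trailing comma makes this a tuple of tuples: one row per (identifier, created) pair
        unique_together = (('identifier', 'created'),)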

I would like to query for all objects whose created is maximal among the objects sharing the same identifier.

In SQL a window function in a subquery solves the problem:

SELECT identifier, content
  FROM entity
  WHERE (identifier, created)
    IN (SELECT identifier, max(created) OVER (PARTITION BY identifier)
          FROM entity);

See also: http://sqlfiddle.com/#!17/c541f/1/0

Both window functions and subqueries are available in Django 2.0. However, I have not found a way to express subquery expressions with multiple columns.

Is there a way to translate that SQL query into the Django QuerySet world? Or is this an XY problem that can be solved differently?

My ugly workaround is

Entity.objects.raw('''
SELECT * FROM app_entity e
 WHERE e.created = (SELECT max(f.created) FROM app_entity f WHERE e.identifier = f.identifier)''')

since the underlying sqlite3 version apparently cannot handle multi-column subqueries.
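For anyone who wants to try the workaround query outside of Django, here is a minimal sketch using Python's built-in sqlite3 module. The table name app_entity comes from the raw() call above; the sample rows and the helper name latest_per_identifier are made up for illustration.

```python
import sqlite3

# Hypothetical sample rows: (identifier, created, content).
ROWS = [
    (1, 10, 100),
    (1, 20, 200),
    (2, 5, 300),
    (2, 7, 400),
    (3, 1, 500),
]


def latest_per_identifier(rows):
    """Return the rows whose created is maximal within their identifier."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE app_entity ("
            " identifier INTEGER NOT NULL,"
            " created INTEGER NOT NULL,"
            " content INTEGER NOT NULL,"
            " UNIQUE (identifier, created))"
        )
        conn.executemany("INSERT INTO app_entity VALUES (?, ?, ?)", rows)
        # Same correlated subquery as in the raw() workaround above,
        # with an explicit column list and ordering for readability.
        cursor = conn.execute(
            """
            SELECT e.identifier, e.created, e.content
              FROM app_entity e
             WHERE e.created = (SELECT max(f.created)
                                  FROM app_entity f
                                 WHERE f.identifier = e.identifier)
             ORDER BY e.identifier
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


latest = latest_per_identifier(ROWS)
print(latest)
```

Because (identifier, created) is unique, the query returns exactly one row per identifier.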

I think you can do it another way (but I'm not sure if it will perform better or worse than a window expression)...

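from django.db.models import OuterRef, Subquery

# Newest created value for the outer row's identifier
max_created = Entity.objects.filter(
    identifier=OuterRef('identifier')
).order_by('-created').values('created')[:1]

Entity.objects.filter(
    created=Subquery(max_created)
)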

This grabs the largest created value for a given identifier, as a correlated subquery, and then filters for only those that match.
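To make the shape of that correlated subquery concrete, here is a rough sketch of the SQL I would expect the ORM to produce, run through Python's sqlite3 module. The SQL text is my assumption, not captured Django output (print str(queryset.query) to see the real thing), and the sample rows and names are hypothetical.

```python
import sqlite3

# Hypothetical sample rows: (identifier, created, content).
SAMPLE_ROWS = [(1, 10, 100), (1, 20, 200), (2, 5, 300), (2, 7, 400), (3, 1, 500)]

# Assumed approximation of the ORM's query: ORDER BY ... DESC LIMIT 1
# plays the role of order_by('-created').values('created')[:1].
ORDER_LIMIT_SQL = """
    SELECT e.identifier, e.created, e.content
      FROM app_entity e
     WHERE e.created = (SELECT f.created
                          FROM app_entity f
                         WHERE f.identifier = e.identifier
                         ORDER BY f.created DESC
                         LIMIT 1)
     ORDER BY e.identifier
"""


def run_order_limit_query(rows):
    """Load rows into an in-memory table and run the correlated subquery."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE app_entity ("
            " identifier INTEGER, created INTEGER, content INTEGER,"
            " UNIQUE (identifier, created))"
        )
        conn.executemany("INSERT INTO app_entity VALUES (?, ?, ?)", rows)
        return conn.execute(ORDER_LIMIT_SQL).fetchall()
    finally:
        conn.close()


newest = run_order_limit_query(SAMPLE_ROWS)
print(newest)
```

The inner query is evaluated per outer row, so only a single column is compared and no multi-column IN is needed.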

This may need tweaking: I'm not sure whether you can filter on the subquery directly like that, or whether you need to .annotate(max_created=Subquery(max_created)).filter(created=F('max_created')) or something else horrible like that.

Also, if you are on PostgreSQL, you can use the DISTINCT ON feature to get a really neat solution:

Entity.objects.order_by('identifier', '-created').distinct('identifier')
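If DISTINCT ON is unfamiliar: it keeps the first row of each group according to the ORDER BY. A plain-Python sketch of that behaviour for the query above, with made-up rows and a hypothetical helper name, could look like this:

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows: (identifier, created, content), deliberately unsorted.
UNSORTED_ROWS = [(1, 10, 100), (2, 7, 400), (1, 20, 200), (3, 1, 500), (2, 5, 300)]


def distinct_on_identifier(rows):
    """Emulate ORDER BY identifier, created DESC with DISTINCT ON (identifier)."""
    # ORDER BY identifier ASC, created DESC (created is an integer, so negate it).
    ordered = sorted(rows, key=lambda row: (row[0], -row[1]))
    # DISTINCT ON (identifier): keep only the first row of each identifier group.
    return [next(group) for _, group in groupby(ordered, key=itemgetter(0))]


picked = distinct_on_identifier(UNSORTED_ROWS)
print(picked)
```

This is only meant to illustrate the semantics; the database does the same work in a single pass over the ordered rows.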

Philipp Matthias Schäfer

Matthew Schinckel
