Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django ORM subquery with window function

I'm trying to do this query with Django's ORM:

SELECT
  id,
  pn,
  revision,
  description
FROM (SELECT
        id,
        pn,
        revision,
        MAX(revision)
        OVER (
          PARTITION BY pn ) max_rev,
        description
      FROM table) maxarts
WHERE revision = max_rev

The result needs to be a queryset, i have tried every combination of Window/OuterRef/Subquery i know with no success. Do i have to use a raw query?

Thanks in advance Marco

EDIT #1:

I'll try to explain better, i have a model that looks like this:

class Article(models.Model):
  pn = models.CharField()
  revision = models.CharField()
  description = models.CharField()

  class Meta:
    unique_together = [("pn", "revision"), ]

The data is something like:

pn1    rev1    description
pn1    rev2    description
pn2    rev1    anotherdescription
pn1    rev3    description
pn2    rev2    anotherdescription

I need to have a queryset containing only the Max("revision") value, which increments every time a user make a modfication to the object. I hope that is more clear now. Thanks!

EDIT #2

As suggested i'm writing what i've already tried:

Raw SQL using the query written in the first message, selecting only the id field and passing it to the ORM as id__in=ids. Slow as hell, unusable.

Declared a WIndow function to use as filter:

Article.objects.annotate(
    max_rev=Window(expression=Max("revision"), partition_by=F("pn"))
).filter(revision=F("max_rev"))

But Django complained that i cannot use a window function in a where clause (that's correct).

Then i've tried to use the window as subquery:

window_query = Article.objects.annotate(
    max_rev=Window(expression=Max("revision"), partition_by=F("pn"))
)
result = Article.objects.filter(revision=Subquery(window_query))

I've tried also with OuterRef, to use the max_rev annotation as a join, no luck. I'm out of ideas!


1 Answers

I think you can get what you are after, not much different to what you had, by using FirstValue rather than Max:

>>> window_query = Article.objects.annotate(max_id=Window(
        expression=FirstValue("id"),
        partition_by=F("pn"),
        order_by=F("revision").desc()
    )).values("max_id")
>>> list(Article.objects.filter(id__in=Subquery(window_query)))
[<Article: Article object (4)>, <Article: Article object (5)>]

This produces SQL like: SELECT * FROM articles_article WHERE id IN (SELECT FIRST_VALUE(id) OVER (PARTITION BY pn ORDER BY revision DESC) AS max_id FROM articles_article).

The subquery says order your window by revision descending, partitioned by pn, and take the first ID from each partition; then we use that in the parent query to fetch the relevant Articles for those IDs.

On PostgreSQL, you could also do:

>>> Article.objects.order_by('pn', '-revision').distinct('pn')
<QuerySet [<Article: Article object (4)>, <Article: Article object (5)>]>

This produces SQL like SELECT DISTINCT ON (pn) * FROM articles_article ORDER BY pn ASC, revision DESC.

like image 155
M Somerville Avatar answered Feb 15 '26 17:02

M Somerville



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!