I'm trying to do this query with Django's ORM:
SELECT
id,
pn,
revision,
description
FROM (SELECT
id,
pn,
revision,
MAX(revision)
OVER (
PARTITION BY pn ) max_rev,
description
FROM table) maxarts
WHERE revision = max_rev
The result needs to be a queryset, i have tried every combination of Window/OuterRef/Subquery i know with no success. Do i have to use a raw query?
Thanks in advance Marco
EDIT #1:
I'll try to explain better, i have a model that looks like this:
class Article(models.Model):
pn = models.CharField()
revision = models.CharField()
description = models.CharField()
class Meta:
unique_together = [("pn", "revision"), ]
The data is something like:
pn1 rev1 description
pn1 rev2 description
pn2 rev1 anotherdescription
pn1 rev3 description
pn2 rev2 anotherdescription
I need to have a queryset containing only the Max("revision") value, which increments every time a user make a modfication to the object. I hope that is more clear now. Thanks!
EDIT #2
As suggested i'm writing what i've already tried:
Raw SQL using the query written in the first message, selecting only the id field and passing it to the ORM as id__in=ids. Slow as hell, unusable.
Declared a WIndow function to use as filter:
Article.objects.annotate(
max_rev=Window(expression=Max("revision"), partition_by=F("pn"))
).filter(revision=F("max_rev"))
But Django complained that i cannot use a window function in a where clause (that's correct).
Then i've tried to use the window as subquery:
window_query = Article.objects.annotate(
max_rev=Window(expression=Max("revision"), partition_by=F("pn"))
)
result = Article.objects.filter(revision=Subquery(window_query))
I've tried also with OuterRef, to use the max_rev annotation as a join, no luck. I'm out of ideas!
I think you can get what you are after, not much different to what you had, by using FirstValue rather than Max:
>>> window_query = Article.objects.annotate(max_id=Window(
expression=FirstValue("id"),
partition_by=F("pn"),
order_by=F("revision").desc()
)).values("max_id")
>>> list(Article.objects.filter(id__in=Subquery(window_query)))
[<Article: Article object (4)>, <Article: Article object (5)>]
This produces SQL like: SELECT * FROM articles_article WHERE id IN (SELECT FIRST_VALUE(id) OVER (PARTITION BY pn ORDER BY revision DESC) AS max_id FROM articles_article).
The subquery says order your window by revision descending, partitioned by pn, and take the first ID from each partition; then we use that in the parent query to fetch the relevant Articles for those IDs.
On PostgreSQL, you could also do:
>>> Article.objects.order_by('pn', '-revision').distinct('pn')
<QuerySet [<Article: Article object (4)>, <Article: Article object (5)>]>
This produces SQL like SELECT DISTINCT ON (pn) * FROM articles_article ORDER BY pn ASC, revision DESC.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With