Django 1.11 Annotating a Subquery Aggregate

Tags: django, django-aggregation, django-annotate, django-subquery

This is a bleeding-edge feature that I'm currently skewered upon and quickly bleeding out. I want to annotate a subquery-aggregate onto an existing queryset. Doing this before 1.11 either meant custom SQL or hammering the database. Here's the documentation for this, and the example from it:

    from django.db.models import OuterRef, Subquery, Sum
    comments = Comment.objects.filter(post=OuterRef('pk')).values('post')
    total_comments = comments.annotate(total=Sum('length')).values('total')
    Post.objects.filter(length__gt=Subquery(total_comments))

They're annotating on the aggregate, which seems weird to me, but whatever.

I'm struggling with this so I'm boiling it right back to the simplest real-world example I have data for. I have Carparks which contain many Spaces. Use Book→Author if that makes you happier but —for now— I just want to annotate on a count of the related model using Subquery*.

    from django.db.models import Count, OuterRef, Subquery
    spaces = Space.objects.filter(carpark=OuterRef('pk')).values('carpark')
    count_spaces = spaces.annotate(c=Count('*')).values('c')
    Carpark.objects.annotate(space_count=Subquery(count_spaces))

This gives me a lovely ProgrammingError: more than one row returned by a subquery used as an expression and in my head, this error makes perfect sense. The subquery is returning a list of spaces with the annotated-on total.

The example suggested that some sort of magic would happen and I'd end up with a number I could use. But that's not happening here? How do I annotate on aggregate Subquery data?

Hmm, something's being added to my query's SQL...

I built a new Carpark/Space model and it worked. So the next step is working out what's poisoning my SQL. On Laurent's advice, I took a look at the SQL and tried to make it more like the version they posted in their answer. And this is where I found the real problem:

    SELECT "bookings_carpark".*,
           (SELECT COUNT(U0."id") AS "c"
              FROM "bookings_space" U0
             WHERE U0."carpark_id" = ("bookings_carpark"."id")
             GROUP BY U0."carpark_id", U0."space"
           ) AS "space_count"
      FROM "bookings_carpark";

The problem is that subquery's GROUP BY ... U0."space". It's returning both columns in the GROUP BY for some reason. Investigations continue.

Edit 2: Okay, just looking at the subquery SQL I can see that second group by coming through ☹

    In [12]: print(Space.objects_standard.filter().values('carpark').annotate(c=Count('*')).values('c').query)
    SELECT COUNT(*) AS "c"
    FROM "bookings_space"
    GROUP BY "bookings_space"."carpark_id", "bookings_space"."space"
    ORDER BY "bookings_space"."carpark_id" ASC, "bookings_space"."space" ASC

Edit 3: Okay! Both these models have sort orders. These are being carried through to the subquery. It's these orders that are bloating out my query and breaking it.
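To make the effect of that extra GROUP BY column concrete, here's a minimal sketch using Python's built-in sqlite3 rather than Django. The table names and sample rows are invented for illustration and are not the real bookings schema. It runs the inner query for a single carpark both ways and looks at how many rows come back. One caveat: SQLite quietly takes the first row of a multi-row scalar subquery, whereas PostgreSQL raises the "more than one row returned" error quoted above.

```python
import sqlite3

# Hypothetical, simplified stand-ins for the Carpark and Space tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carpark (id INTEGER PRIMARY KEY);
    CREATE TABLE space (
        id INTEGER PRIMARY KEY,
        carpark_id INTEGER REFERENCES carpark(id),
        space TEXT
    );
    INSERT INTO carpark (id) VALUES (1);
    INSERT INTO space (carpark_id, space) VALUES (1, 'A'), (1, 'B'), (1, 'C');
""")

# Grouping by carpark_id alone: a single row holding the total.
grouped_by_carpark = conn.execute(
    "SELECT COUNT(*) FROM space WHERE carpark_id = 1 GROUP BY carpark_id"
).fetchall()

# Grouping by the ordering column as well: one row per distinct space.
grouped_by_both = conn.execute(
    "SELECT COUNT(*) FROM space WHERE carpark_id = 1 GROUP BY carpark_id, space"
).fetchall()

print(grouped_by_carpark)  # [(3,)]
print(grouped_by_both)     # [(1,), (1,), (1,)]
```

The second result is the shape that breaks a scalar subquery: three rows of 1 instead of one row of 3.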

I guess this might be a bug in Django but, short of removing the Meta ordering on both these models, is there any way I can unsort a query at query time?

*I know I could just annotate a Count for this example. My real purpose for using this is a much more complex filter-count but I can't even get this working.


asked Mar 01 '17 23:03

Oli

2 Answers

Shazaam! Per my edits, an additional column was being output from my subquery. This was to facilitate ordering (which just isn't required in a COUNT).

I just needed to clear the default ordering that the model's Meta applies. You don't have to touch the model for this: adding an empty .order_by() to the subquery is enough. In my code terms that meant:

    from django.db.models import Count, OuterRef, Subquery

    spaces = Space.objects.filter(carpark=OuterRef('pk')).order_by().values('carpark')
    count_spaces = spaces.annotate(c=Count('*')).values('c')
    Carpark.objects.annotate(space_count=Subquery(count_spaces))

And that works. Superbly. So annoying.
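For anyone who wants to see the query shape the fixed queryset is aiming for, here's a rough sketch against an in-memory SQLite database. The tables, column names and sample rows are assumptions made up for the example, and the SQL is hand-written rather than captured from Django. With only carpark_id in the GROUP BY, the correlated subquery yields at most one row per carpark, so it can be used as a plain value.

```python
import sqlite3

# Hypothetical, simplified tables and data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carpark (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE space (id INTEGER PRIMARY KEY, carpark_id INTEGER, space TEXT);
    INSERT INTO carpark (id, name) VALUES (1, 'North'), (2, 'South');
    INSERT INTO space (carpark_id, space) VALUES
        (1, 'A'), (1, 'B'), (2, 'A'), (2, 'B'), (2, 'C');
""")

# Correlated scalar subquery grouped on the foreign key only, no ordering column.
rows = conn.execute("""
    SELECT c.name,
           (SELECT COUNT(*)
              FROM space s
             WHERE s.carpark_id = c.id
             GROUP BY s.carpark_id) AS space_count
      FROM carpark c
     ORDER BY c.id
""").fetchall()

print(rows)  # [('North', 2), ('South', 3)]
```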

answered Sep 23 '22 08:09

Oli

It's also possible to create a subclass of Subquery that changes the SQL it outputs. For instance, you can use:

    from django.db import models
    from django.db.models import Subquery

    class SQCount(Subquery):
        template = "(SELECT count(*) FROM (%(subquery)s) _count)"
        output_field = models.IntegerField()

You then use this as you would the original Subquery class:

    spaces = Space.objects.filter(carpark=OuterRef('pk')).values('pk')
    Carpark.objects.annotate(space_count=SQCount(spaces))

You can use this trick (at least in postgres) with a range of aggregating functions: I often use it to build up an array of values, or sum them.
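As a rough illustration of what that template expands to, here's a sketch in plain SQL run through Python's sqlite3. The schema and rows are invented, and the SQL is written by hand to mirror the template, not taken from Django's output. It compares the count-wrapper form with the GROUP BY form for a carpark that has no spaces at all.

```python
import sqlite3

# Hypothetical tables: carpark 1 has two spaces, carpark 2 has none.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE carpark (id INTEGER PRIMARY KEY);
    CREATE TABLE space (id INTEGER PRIMARY KEY, carpark_id INTEGER, space TEXT);
    INSERT INTO carpark (id) VALUES (1), (2);
    INSERT INTO space (carpark_id, space) VALUES (1, 'A'), (1, 'B');
""")

# Wrapper form: count the rows of the inner query, even though it is ordered.
wrapped = conn.execute("""
    SELECT c.id,
           (SELECT count(*)
              FROM (SELECT s.id
                      FROM space s
                     WHERE s.carpark_id = c.id
                     ORDER BY s.space) _count) AS space_count
      FROM carpark c
     ORDER BY c.id
""").fetchall()

# GROUP BY form: no matching rows means no group, so the subquery yields NULL.
grouped = conn.execute("""
    SELECT c.id,
           (SELECT COUNT(*)
              FROM space s
             WHERE s.carpark_id = c.id
             GROUP BY s.carpark_id) AS space_count
      FROM carpark c
     ORDER BY c.id
""").fetchall()

print(wrapped)  # [(1, 2), (2, 0)]
print(grouped)  # [(1, 2), (2, None)]
```

Two things fall out of this. The wrapper always returns exactly one row, so any ordering on the inner query can't break it. It also gives 0 for an empty carpark where the GROUP BY form gives NULL.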

answered Sep 24 '22 08:09

Matthew Schinckel
