Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Django queryset union appears not to be working when combined with .annotate()

I have the following queryset:

photos = Photo.objects.all()

I filter out two queries:

a = photos.filter(gallery__name='NGL')
b = photos.filter(gallery__name='NGA')

I add them together, and they form one new, bigger queryset:

c = a | b

Indeed, the length of a + b equals c:

a.count() + b.count() == c.count()
>>> True

So far so good. Yet, if I introduce a .annotate(), the | no longer seems to work:

a = photos.annotate(c=Count('label').exclude(c__lte=4)
b = photos.filter(painting=True)
c = a | b
a.count() + b.count() == c.count()
>>> False

How do I combine querysets, even when .annotate() is being used? Note that query one and two both work as intended in isolation, only when combining them using | does it seem to go wrong.

like image 317
Mitchell van Zuylen Avatar asked Dec 29 '17 21:12

Mitchell van Zuylen


1 Answers

the pipe | or ampersand & to combine querysets actually puts OR or AND to SQL query so it looks like combined.

one = Photo.objects.filter(id=1)
two = Photo.objects.filter(id=2)
combined = one | two

print(combined.query)
>>> ... WHERE ("photo_photo"."id" = 1 OR "photo_photo"."id" = 2)...

But when you combine more filters and excludes you may notice it will give you strange results due to this. So that is why it doesn't match when you compare counts.

If you use .union() you have to have same columns with same data type, so you have to annotate both querysets. Info about .union()

  • SELECT statement within .UNION() must have the same number of columns
  • The columns must also have similar data types
  • The columns in each SELECT statement must also be in the same order

You have to keep in mind, that pythons argument kwargs for indefinite number of arguments are dictionary, so if you want to use annotate with multiple annotations, you can't ensure correct order of columns. Fortunatelly you can solve this with chaining annotate commands.

 # printed query of this won't be consistent
 photo_queryset.annotate(label_count=Count('labels'), tag_count=Count('tags')) 
 # this will always have same order of columns
 photo_queryset.annotate(label_count=Count('labels')).annotate(tag_count=Count('tags'))

Then you can use .union() and it won't mess up results of annotation. Also .union() should be last method, because after .union() you can't use filter like methods. If you want to preserve duplicates, you use .union(qs, all=True) since .union() has default all=False and calls DISTINCT on queryset

photos = Photo.objects.annotate(c=Count('labels'))
one = photos.exclude(c__lte=4)
two = photos.filter(painting=True)
all = one.union(two, all=True)
one.count() + two.count() == all.count()
>>> True

then it should work like you described in question

like image 85
Sajmon Avatar answered Sep 28 '22 05:09

Sajmon