
Django: How to annotate M2M or OneToMany fields using a SubQuery?

I have Order objects and OrderOperation objects that represent an action on an Order (creation, modification, cancellation).

Conceptually, an order has one or more order operations. Each operation recomputes the order's total, so when I need an attribute of an order, I read it from the latest order operation instead, using a Subquery.

The simplified code

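from django.db import models
from django.db.models import OuterRef, Subquery

class OrderQuerySet(models.QuerySet):

    @staticmethod
    def _last_oo(field):
        # Value of `field` on the most recent OrderOperation of each order
        return Subquery(OrderOperation.objects
                        .filter(order_id=OuterRef("pk"))
                        .order_by('-id')
                        .values(field)
                        [:1])

    def annotated_total(self):
        return self.annotate(oo_total=self._last_oo('total'))

class Order(models.Model):
    # ...
    # Needed so that Order.objects exposes annotated_total()
    objects = OrderQuerySet.as_manager()

class OrderOperation(models.Model):
    # on_delete is required since Django 2.0; CASCADE is assumed here
    order = models.ForeignKey(Order, on_delete=models.CASCADE)
    total = models.DecimalField(max_digits=9, decimal_places=2)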

This way, I can run my_order_total = Order.objects.annotated_total()[0].oo_total. It works great.

The issue

Computing the total is easy because it is a single value. However, with an M2M or one-to-many field, this method does not work. For example, let's add this field to the model above:

class OrderOperation(models.Model):
    # on_delete is required since Django 2.0; CASCADE is assumed here
    order = models.ForeignKey(Order, on_delete=models.CASCADE)
    total = models.DecimalField(max_digits=9, decimal_places=2)
    ordered_articles = models.ManyToManyField(Article, through='orders.OrderedArticle')

Writing something like the following does NOT work as it returns only 1 foreign key (not a list of all the FKs):

def annotated_ordered_articles(self):
    return self.annotate(oo_ordered_articles=self._last_oo('ordered_articles'))
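
To see why only one key comes back, here is a plain-Python sketch of what the subquery does. It is not Django code. Selecting ordered_articles joins each operation to its through-table rows, so one operation yields one row per article, and the [:1] slice then keeps a single row. The data and the names operation_articles and last_oo_value are made up for illustration.

```python
# Hypothetical joined rows: (order_id, operation_id, article_id).
operation_articles = [
    (1, 10, 41),   # first operation on order 1
    (1, 11, 42),   # last operation on order 1 ...
    (1, 11, 43),   # ... which holds two articles
]

def last_oo_value(order_id):
    """Mimic filter(order_id=...).order_by('-id').values('ordered_articles')[:1]."""
    rows = [r for r in operation_articles if r[0] == order_id]
    rows.sort(key=lambda r: r[1], reverse=True)            # ORDER BY operation id DESC
    return [article_id for _, _, article_id in rows[:1]]   # LIMIT 1

# Exactly one article id survives the slice, even though the
# last operation references two articles.
print(last_oo_value(1))
```

Which of the two article ids survives depends on how ties are ordered; in SQL that is not guaranteed.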

The purpose

The whole purpose is to let a user search among all orders by providing a list of articles as input. For example: "Please find all orders containing at least article 42 or article 43", or "Please find all orders containing exactly articles 42 and 43", etc.

If I could get something like:

>>> Order.objects.annotated_ordered_articles()[0].oo_ordered_articles
<ArticleQuerySet [<Article: Article42>, <Article: Article43>]>

or even:

>>> Order.objects.annotated_ordered_articles()[0].oo_ordered_articles
[42,43]

That would solve my issue.

My current idea

  • Maybe something like ArrayAgg (I'm using PostgreSQL) could do the trick, but I'm not sure I understand how to use it in my case.
  • Maybe this has to do with the values() method, which does not seem intended to handle M2M and one-to-many relations, as stated in the docs:

values() and values_list() are both intended as optimizations for a specific use case: retrieving a subset of data without the overhead of creating a model instance. This metaphor falls apart when dealing with many-to-many and other multivalued relations (such as the one-to-many relation of a reverse foreign key) because the “one row, one object” assumption doesn’t hold.

Asked by David D. on Dec 18 '22 at 18:12



2 Answers

ArrayAgg works well if you want to fetch only one value (e.g. the name) from each article. If you need more, there is a better option:

prefetch_related

Instead, you can prefetch, for each Order, its latest OrderOperation as a whole object. That lets you read any field of the OrderOperation without extra magic.

The only caveat is that you always get a list: it holds one operation, or it is empty when the order has no operations.

To do that, use the prefetch_related queryset method together with a Prefetch object and a custom queryset for OrderOperation. Example:

from django.db.models import Max, F, Prefetch

last_order_operation_qs = OrderOperation.objects.annotate(
    lop_pk=Max('order__orderoperation__pk')
).filter(pk=F('lop_pk'))

orders = Order.objects.prefetch_related(
    Prefetch('orderoperation_set', queryset=last_order_operation_qs, to_attr='last_operation')
)

Then you can use order.last_operation[0].ordered_articles.all() to get all ordered articles for a particular order. You can add prefetch_related('ordered_articles') to the first queryset to improve performance and issue fewer database queries.

Answered by GwynBleidD on Apr 06 '23 at 21:04



To my surprise, your idea with ArrayAgg is right on the money. I didn't know there was a way to annotate with an array (and I believe there still isn't for backends other than Postgres).

from django.contrib.postgres.aggregates import ArrayAgg

qs = Order.objects.annotate(oo_articles=ArrayAgg(
            'order_operation__ordered_articles__id',
            distinct=True))

You can then filter the resulting queryset using the ArrayField lookups:

# Orders whose articles include every id in the specified array
qs.filter(oo_articles__contains=[42, 43])
# Orders whose article array is identical to the specified array
qs.filter(oo_articles=[42, 43, 44])
# Orders whose articles are all contained in the specified array
qs.filter(oo_articles__contained_by=[41, 42, 43, 44, 45])
# Orders that have at least one article in common
# with the specified array
qs.filter(oo_articles__overlap=[41, 42])

Deduplication (the DISTINCT argument) is needed only if the operations may contain duplicate articles.
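
The four lookups above can be modeled with plain Python sets, which may help when choosing one for a given search. This is only a sketch of the semantics, not Django code, and the helper names are made up. One point worth noting: PostgreSQL compares arrays element by element for equality, so the exact match is order-sensitive, while the containment and overlap operators ignore order and duplicates.

```python
def contains(row, wanted):
    """oo_articles__contains: every wanted id is present in the row."""
    return set(wanted) <= set(row)

def contained_by(row, allowed):
    """oo_articles__contained_by: the row holds nothing outside `allowed`."""
    return set(row) <= set(allowed)

def overlap(row, candidates):
    """oo_articles__overlap: at least one id in common."""
    return bool(set(row) & set(candidates))

def exact(row, wanted):
    """oo_articles=[...]: same elements in the same order."""
    return list(row) == list(wanted)

def same_set(row, wanted):
    """Order-insensitive 'exactly these ids': contains AND contained_by."""
    return contains(row, wanted) and contained_by(row, wanted)

# An aggregated array may come back in any order.
row = [43, 42]
print(exact(row, [42, 43]))     # False: order differs
print(same_set(row, [42, 43]))  # True
```

Under that assumption, a search for "exactly articles 42 and 43" may be more robust when expressed as both __contains and __contained_by on the same list than as an exact match, unless the aggregate is explicitly ordered.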

You may need to adjust the exact field name passed to ArrayAgg. For subsequent filtering to work, you may also need to cast the id fields in the ArrayAgg to int; otherwise Django casts the id array to ::serial[], and my Postgres complained that type "serial[]" does not exist:

from django.db.models import IntegerField
from django.contrib.postgres.fields.array import ArrayField
from django.db.models.functions import Cast

ArrayAgg(Cast('order_operation__ordered_articles__id', IntegerField()))
# OR
Cast(ArrayAgg('order_operation__ordered_articles__id'), ArrayField(IntegerField()))

Looking at your posted code more closely, you'll also have to filter on the one OrderOperation you are interested in; the query above looks at all operations for the relevant order.
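
To illustrate why that restriction matters, here is a plain-Python sketch (not Django code, with made-up data and function names) that contrasts aggregating article ids over all operations of an order with aggregating over the last operation only.

```python
from collections import defaultdict

# Hypothetical joined rows: (order_id, operation_id, article_id).
rows = [
    (1, 10, 41), (1, 10, 42),   # order 1, first operation
    (1, 11, 42), (1, 11, 43),   # order 1, last operation (41 was dropped)
    (2, 20, 44),                # order 2, single operation
]

def aggregate_all_operations(rows):
    """Distinct article ids per order across every operation."""
    result = defaultdict(set)
    for order_id, _, article_id in rows:
        result[order_id].add(article_id)
    return {order: sorted(ids) for order, ids in result.items()}

def aggregate_last_operation(rows):
    """Distinct article ids per order, restricted to the highest operation id."""
    last_op = {}
    for order_id, op_id, _ in rows:
        last_op[order_id] = max(op_id, last_op.get(order_id, op_id))
    result = defaultdict(set)
    for order_id, op_id, article_id in rows:
        if op_id == last_op[order_id]:
            result[order_id].add(article_id)
    return {order: sorted(ids) for order, ids in result.items()}

print(aggregate_all_operations(rows))   # {1: [41, 42, 43], 2: [44]}
print(aggregate_last_operation(rows))   # {1: [42, 43], 2: [44]}
```

With the unrestricted aggregate, a search for article 41 would still match order 1 even though its latest operation no longer contains that article.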

Answered by Endre Both on Apr 06 '23 at 20:04
