Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Can Django ORM calculate market share of specific brands over multiple time periods in a single query?

Tags:

orm

django

I have the following Django model:

class MarketData(models.Model):
    category = models.CharField(max_length=64, null=True)
    brand = models.CharField(max_length=128, null=True)
    time_period = models.DateField()
    sales = models.IntegerField(default=0, null=True)

I am trying to use Django's ORM to calculate the market share of each brand within each time period of my dataset to output to a line chart. The market share calculation itself is simple: the sales of each brand within the time period divided by the total sales in each time period.

I can use Django's ORM to generate a queryset of brands and sales by time period, but I'm struggling to annotate the additional metric of sales for all brands.

My current query:

sales = MarketData.objects.filter(brand__in=['Brand 1', 'Brand 2']) \
                          .values('time_period', 'brand') \
                          .annotate(sales=Sum('sales')) \
                          .order_by('period', '-sales')

My current queryset output: <QuerySet [{'time_period': datetime.date(2020, 6, 30), 'brand': "Brand 1", 'sales': 21734}, {'time_period': datetime.date(2020, 6, 30), 'brand': 'Brand 2', 'sales': 93622}]>

How can I add another annotation that includes the sales of ALL brands for that time period, not just the brands specified in the filter, and not just the current brand?

I have tried to use a Subquery to get the total market sales by period, but I end up with an aggregate across all periods, or an ungodly-long SQL statement that takes 13 seconds to run. Roughly, the SQL statement that I want is the following, which takes 0.2 seconds:

SELECT
    brand
    b.time_period,
    SUM(sales) / market_total AS share
FROM market_data AS b
JOIN (
    -- Create sub-query of total market sales for each period
    SELECT
        time_period,
        SUM(sales) AS market_total
    FROM marketdata
    GROUP BY
        time_period
    ) AS m ON m.time_period = b.time_period
WHERE
    brand IN ('Brand 1', 'Brand 2')
    AND b.time_period BETWEEN '2020-01-01' AND '2023-03-31'
GROUP BY
    brand,
    b.time_period,
    market_total
ORDER BY
    brand,
    b.time_period

but I can't figure out a way to generate this using Django's ORM. I would like to use the ORM instead of raw SQL because I have some additional WHERE clauses that I want to include conditionally, and it would be significantly easier to do the filtering using the ORM within 'if' statements.

like image 284
Daniel Avatar asked Dec 11 '25 11:12

Daniel


1 Answers

I believe you can achieve that with a subquery:

subquery = MarketData.objects.annotate(total_sales=Sum('sales')).values("total_sales")[:1]

sales = MarketData.objects.filter(brand__in=['Brand 1', 'Brand 2']) \
                          .values('time_period', 'brand') \
                          .annotate(
                            sales=Sum('sales'),
                            total_brands_sales=Subquery(subquery)
                            )\
                          .order_by('period', '-sales')
like image 92
Bad Boy Avatar answered Dec 13 '25 04:12

Bad Boy