I am aggregating sales for a set of products per day using Postgres and need to know not just when sales do happen, but also when they do not for further processing.
SELECT
sd.date,
COUNT(sd.sale_id) AS sales,
sd.product
FROM sales_data sd
-- sales per product, per day
GROUP BY sd.product, sd.date
ORDER BY sd.product, sd.date
This produces the following:
date | sales | product
------------+-------+-------------------
2017-08-17 | 10 | soap
2017-08-19 | 2 | soap
2017-08-20 | 5 | soap
2017-08-17 | 2 | shower gel
2017-08-21 | 1 | shower gel
As you can see - the date ranges per product are not continuous as sales_data
just didn't contain any info for these products on some days.
What I'm aiming to do is to add a sales = 0
row for each product that is not sold on any day in a range - for example here, between 2017-08-17
and 2017-08-21
to give something like the the following:
date | sales | product
------------+-------+-------------------
2017-08-17 | 10 | soap
2017-08-18 | 0 | soap
2017-08-19 | 2 | soap
2017-08-20 | 5 | soap
2017-08-21 | 0 | soap
2017-08-17 | 2 | shower gel
2017-08-18 | 0 | shower gel
2017-08-19 | 0 | shower gel
2017-08-20 | 0 | shower gel
2017-08-21 | 1 | shower gel
In a simpler case where there was only a single product, it seems like the solution would be to use generate_series()
i.e.:
LEFT JOIN
the already aggregated sales data onto the date seriesCOALESCE
any NULL
counts to 0 in the missing rowsThe problem I have is that this approach does not seem to work dates repeat in the aggregated data as I'm grouping over not just multiple dates, but multiple products also.
It feels like I should be able to do something cunning with window functions here to solve this e.g. joining onto the full date range over partitions defined by the product name - but I can't see a way of actually getting this to work.
You could use:
WITH cte AS (
SELECT date, s.product
FROM ... -- some way to generate date series
CROSS JOIN (SELECT DISTINCT product FROM sales_data) s
)
SELECT
c.date,
c.product,
COUNT(sd.sale_id) AS sales
FROM cte c
LEFT JOIN sales_data sd
ON c.date = sd.date AND c.product= sd.product
GROUP BY c.date, c.product
ORDER BY c.date, c.product;
First create Cartesian product of dates and products, then LEFT JOIN
to actual data and do calculations.
Oracle has great feature for this scenarios called Partitioned Outer Joins:
SELECT times.time_id, product, quantity
FROM inventory PARTITION BY (product)
RIGHT OUTER JOIN times ON (times.time_id = inventory.time_id)
WHERE times.time_id BETWEEN TO_DATE('01/04/01', 'DD/MM/YY')
AND TO_DATE('06/04/01', 'DD/MM/YY')
ORDER BY 2,1;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With