Say I have a table:
| date | fruit | status | numberOfFruits |
|---|---|---|---|
| 2022-01 | apple | ripe | 3 |
| 2022-01 | banana | mature | 5 |
| 2022-01 | pear | ripe | 10 |
| 2022-01 | grapes | mature | 9 |
| 2022-02 | apple | ripe | 3 |
| 2022-02 | banana | mature | 3 |
| 2022-02 | pear | ripe | 3 |
| 2022-02 | grapes | mature | 7 |
I want to be able to create a query that adds 3 aggregate columns (AVG, MIN, MAX) that aggregates the column numberOfFruits based on their status and date, ordered by fruit and date. The output of the table should be:
| date | fruit | status | numberOfFruits | AvgNumOfFruits | MaxNumOfFruits | MinNumOfFruits |
|---|---|---|---|---|---|---|
| 2022-01 | apple | ripe | 3 | 6.5 | 10 | 3 |
| 2022-01 | pear | ripe | 10 | 6.5 | 10 | 3 |
| 2022-01 | banana | mature | 5 | 7 | 9 | 5 |
| 2022-01 | grapes | mature | 9 | 7 | 9 | 5 |
| 2022-02 | apple | ripe | 3 | 3 | 3 | 3 |
| 2022-02 | pear | ripe | 3 | 3 | 3 | 3 |
| 2022-02 | banana | mature | 3 | 5 | 7 | 3 |
| 2022-02 | grapes | mature | 7 | 5 | 7 | 3 |
I'm at a loss and here's what I have so far:
SELECT
date,
fruit,
status,
numberOfFruits,
AVG(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as AvgNumOfFruits,
MIN(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as MingNumOfFruits,
MAX(CASE WHEN "status" = 'ripe' THEN "numberOfFruits" ELSE "numberOfFruits" END) as MaxNumOfFruits
FROM fruitdata
GROUP BY 1, 2, 3, 4
ORDER BY date, status
The query just repeats the numberofFruits for those 3 aggregate columns. It should be like "take the average of the column numberOfFruits that has the status of ripe, as well as the date 2022-01, and put it in a new column called AvgNumOfFruits. Somehow I can't translate this to SQL.
Any advice/tips/help would be appreciated. Thanks in advance!
Aggregation functions are employed to aggregate rows (combining rows to get aggregated values). In your case you should rather use window functions, that compute values over windows (partitions/groups of rows), though without aggregating the rows.
SELECT *,
AVG(numberOfFruits) OVER(PARTITION BY date, status) AS AvgNumOfFruits,
MAX(numberOfFruits) OVER(PARTITION BY date, status) AS MaxNumOfFruits,
MIN(numberOfFruits) OVER(PARTITION BY date, status) AS MinNumOfFruits
FROM fruitdata
ORDER BY date,
status DESC
The corresponding window functions need to be computed on groups of "date" and "status" fields. On the other hand, the ordering is completely optional and up to you (ORDER BY clause).
Check the demo here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With