I'd like to know if it's possible to use window queries based on time in Postgresql.
The original data is in the first three columns (date, salesman, amount):
| date | salesman | amount | sum-3-rolling-days |
|---|---|---|---|
| 2020-01-01 | john | 10 | 10 |
| 2020-01-02 | john | 15 | 25 |
| 2020-01-03 | john | 8 | 33 |
| 2020-01-04 | john | 12 | 35 |
| 2020-01-05 | john | 11 | 31 |
| 2020-01-01 | daniel | 5 | 5 |
| 2020-01-02 | daniel | 6 | 11 |
| 2020-01-03 | daniel | 7 | 18 |
| 2020-01-04 | daniel | 8 | 21 |
| 2020-01-05 | daniel | 9 | 24 |
The fourth column represents the total amount by this salesman during the past three rolling days.
Pandas has built-in functions to do this, but I can't think of any way to do it in Postgresql using built-in sum() over () syntax. The only way I was able to is using a convoluted mix of lateral joins and subqueries with conditions on time delta comparisons, which is inelegant to say the least.
Pandas' way (by memory, exact syntax might differ slightly) — cannot get any terser:
df.groupby('salesman').rolling('3d').sum()
demos:db<>fiddle
SELECT
*,
SUM(amount) OVER (
PARTITION BY salesman -- 1
ORDER BY "date" -- 2
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW -- 3
)
FROM mytable
salesman. So, the following will only done within theses partitionsdate columnIf you are using Postgres 11 or later, you can define the window even more precisely using RANGE with date intervals instead of counting ROWS:
SELECT
*,
SUM(amount) OVER (
PARTITION BY salesman
ORDER BY "date"
RANGE BETWEEN interval '2 days' PRECEDING AND CURRENT ROW
)
FROM mytable
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With