I need to get a list of date ranges that are NOT overlapping with each other from a list of overlapping dates and get the sum of coins during that overlap. I have tried googling for an example but no luck so far. I might not be using the right key words?
I have a list of overlapping dates
1.1.2018 - 31.1.2018 80
7.1.2018 - 10.1.2018 10
7.1.2018 - 31.1.2018 10
11.1.2018 - 31.1.2018 5
25.1.2018 - 27.1.2018 5
2.2.2018 - 23.2.2018 100
Desired outcome would be
1.1.2018 - 6.7.2018 80 coins
7.1.2018 - 10.1.2018 100 coins
11.1.2018 - 24.1.2018 95 coins
25.1.2018 - 27.1.2018 100 coins
28.1.2018 - 31.1.2018 95 coins
2.2.2018 - 23.2.2018 100 coins
Here is a figure how it should work
|------------------------------|
|---|
|-----------------------|
|-------------------|
|---|
|----------------------|
Outcome
|------|---|----------|---|----| |----------------------|
80 100 95 100 95 100
This is my test data
drop table coinsonperiod2;
create table coinsonperiod2(
id serial,
startdate date,
enddate date,
coins integer,
userid integer
);
insert into coinsonperiod2 (startdate, enddate, coins,userid) values
('2018-01-01','2018-01-31', 80,1)
, ('2018-01-07','2018-01-10', 10,1)
, ('2018-01-07','2018-01-31', 10,1)
, ('2018-01-11','2018-01-31', 5,1)
, ('2018-01-25','2018-01-27', 5,1)
, ('2018-02-02','2018-02-23', 100,2)
, ('2018-01-01','2018-01-31', 80,2)
, ('2018-01-07','2018-01-10', 10,2)
, ('2018-01-07','2018-01-31', 10,2)
, ('2018-01-11','2018-01-31', 5,2)
, ('2018-01-25','2018-01-27', 5,2)
, ('2018-02-02','2018-02-23', 100,3)
;
UPDATE: Actually StephenM's and joops answers do not meet my desired outcome. Both answers show enddate wrong.
When one period ends the next should start next day (or later if there is a gap). In my desired outcome 1.1.2018-6.1.2018 includes the 6th day. There is no gap between 6th and 7th because 7th is included in 7.1.2018-10.1.2018.
UPDATE2: Now I understood what is the difference between open, half open and closed intervals. In joops solution, calculation must be done against half open intervals, but my desired outcome is closed interval. That is why enddate must be reduced to make the outcome as closed interval. Correct me if I am wrong.
I also added userid in the sample data and modified joops solution some more. Here is the query that gives me my desired outcome.
with changes AS (
SELECT
userid,
startdate AS tickdate,
coins,
1 AS cover
FROM coinsonperiod2
UNION ALL
-- add 1 day to correct intervals into half open intervals, so the calculation is correct
SELECT
userid,
1 + enddate AS tickdate,
-1 * coins,
-1 AS cover
FROM coinsonperiod2
)
, sumchanges AS (
SELECT
userid,
tickdate,
SUM(coins) AS change,
SUM(cover) AS cover
FROM changes
GROUP BY tickdate, userid
)
, aggregated AS (
SELECT
userid AS userid,
tickdate AS startdate,
lead(tickdate)
over www AS enddate,
sum(change)
OVER www AS cash,
sum(cover)
OVER www AS cover
FROM sumchanges
WINDOW www AS (
partition by userid
ORDER BY tickdate )
)
-- reduce 1 day from the enddate to make closed interval
SELECT
userid
, startdate
, enddate-1 as enddate
, cash
, cover
FROM aggregated
WHERE cover > 0
ORDER BY userid, startdate
;
Outcome:
The logic is:
So the point is: convert the data from a series of intervals to a series of (start/stop) events, and aggregate over these.
-- \i tmp.sql
create table coinsonperiod(
id serial,
startdate date,
enddate date,
coins integer
);
insert into coinsonperiod (startdate, enddate, coins) values
('2018-01-01','2018-01-31', 80)
, ('2018-01-07','2018-01-10', 10)
, ('2018-01-07','2018-01-31', 10)
, ('2018-01-11','2018-01-31', 5)
, ('2018-01-25','2018-01-27', 5)
, ('2018-02-02','2018-02-23', 100)
;
WITH changes AS (
SELECT startdate AS tickdate , coins
, 1 AS cover
FROM coinsonperiod
UNION ALL
-- add 1 day to convert to half-open intervals
SELECT 1+enddate AS tickdate, -1* coins
, -1 AS cover
FROM coinsonperiod
)
, sumchanges AS (
SELECT tickdate, SUM(coins) AS change, SUM(cover) AS cover
FROM changes
GROUP BY tickdate
)
, aggregated AS (
SELECT
tickdate AS startdate
, lead(tickdate) over www AS enddate
, sum(change) OVER www AS cash
-- number of covered intervals
, sum(cover) OVER www AS cover
FROM sumchanges
WINDOW www AS (ORDER BY tickdate)
)
-- substract one day from enddate to correct back to closed intervals
SELECT startdate, enddate-1 AS enddate, cash, cover
FROM aggregated
WHERE cover > 0
ORDER BY startdate
;
Looks like I found an ugly one that works
select t1.dt, t1.enddt, sum(coins)
from (
select distinct cp1.dt, min(cp2.dt) enddt
from ( select startdate as dt from coinsonperiod union all select enddate as dt from coinsonperiod ) cp1,
( select startdate as dt from coinsonperiod union all select enddate as dt from coinsonperiod ) cp2
where cp2.dt > cp1.dt
group by cp1.dt
order by cp1.dt ) t1, coinsonperiod t2
where t1.dt between t2.startdate and t2.enddate
and t1.enddt between t2.startdate and t2.enddate
group by t1.dt, t1.enddt
Output:
dt |enddt |sum |
-----------|-----------|----|
2018-01-01 |2018-01-07 |80 |
2018-01-07 |2018-01-10 |100 |
2018-01-10 |2018-01-11 |90 |
2018-01-11 |2018-01-25 |95 |
2018-01-25 |2018-01-27 |100 |
2018-01-27 |2018-01-31 |95 |
2018-02-02 |2018-02-23 |100 |
Only difference with your output is that I suppose you forgot the interval between 01/10 and 01/11
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With