I have data in following format.
match_id team_id won_ind
----------------------------
37 Team1 N
67 Team1 Y
98 Team1 N
109 Team1 N
158 Team1 Y
162 Team1 Y
177 Team1 Y
188 Team1 Y
198 Team1 N
207 Team1 Y
217 Team1 Y
10 Team2 N
13 Team2 N
24 Team2 N
39 Team2 Y
40 Team2 Y
51 Team2 Y
64 Team2 N
79 Team2 N
86 Team2 N
91 Team2 Y
101 Team2 N
Here match_id
s are in chronological order, 37 is the first and 217 is the last match played by team1. won_ind
indicated whether the team won the match or not.
So, from the above data, team1 has lost its first match, then won a match, then lost 2 matches, then won 4 consecutive matches and so on. Now I'm interested in finding the longest winning streak for each team.
Team_id longest_streak
------------------------
Team1 4
Team2 3
I know how to find this in plsql, but i was wondering if this can be calculated in pure SQL. I tried using LEAD, LAG and several other functions, but not getting anywhere.
I have created sample fiddle here.
with original_data as (
select 37 match_id, 'Team1' team_id, 'N' won_id from dual union all
select 67 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 98 match_id, 'Team1' team_id, 'N' won_id from dual union all
select 109 match_id, 'Team1' team_id, 'N' won_id from dual union all
select 158 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 162 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 177 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 188 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 198 match_id, 'Team1' team_id, 'N' won_id from dual union all
select 207 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 217 match_id, 'Team1' team_id, 'Y' won_id from dual union all
select 10 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 13 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 24 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 39 match_id, 'Team2' team_id, 'Y' won_id from dual union all
select 40 match_id, 'Team2' team_id, 'Y' won_id from dual union all
select 51 match_id, 'Team2' team_id, 'Y' won_id from dual union all
select 64 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 79 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 86 match_id, 'Team2' team_id, 'N' won_id from dual union all
select 91 match_id, 'Team2' team_id, 'Y' won_id from dual union all
select 101 match_id, 'Team2' team_id, 'N' won_id from dual
),
----------------------------------------------------------------------
new_streaks as (
--
-- Identifying new streaks.
-- ------------------------
--
select
match_id,
team_id,
won_id,
--
-- A new streak is identfied if
--
case when
--
-- a) won_id = 'Y' and
--
won_id = 'Y' and
--
-- b) the previous won_id = 'N':
--
lag(won_id) over (partition by team_id order by match_id) = 'N'
--
--
then 1
--
-- All other cases: no new streak:
else 0
--
end new_streak
from
original_data
),
-------------------------------
streak_no as (
--
-- Assigning a unique number to each streak.
-- -----------------------------------------
--
select
--
match_id,
team_id,
--
-- In order to be able to count the number of records
-- of a streak, we first need to assign a unique number
-- to each streak:
--
sum(new_streak) over (partition by team_id order by match_id) streak_no
--
from
new_streaks
where
-- We're only interested in «winning streaks»:
won_id = 'Y'
),
-----------------------------------------------
--
-- Counting the elements per streak
-- --------------------------------
--
records_per_streak as (
select
count(*) counter,
team_id,
streak_no
from
streak_no
group by
team_id,
streak_no
)
------------------------------------------------
--
-- Finally: we can find the «longest streak»
-- per team:
--
select
max(counter) longest_streak,
team_id
from
records_per_streak
group by team_id
;
I had a similar task on Teradata, modified it to run on Oracle:
SELECT
team_id,
MAX(cnt)
FROM
(
SELECT
team_id,
COUNT(*) AS cnt
FROM
(
SELECT
team_id,
match_id,
won_ind,
SUM(CASE WHEN won_ind <> 'Y' THEN 1 END)
OVER (PARTITION BY team_id
ORDER BY match_id
ROWS UNBOUNDED PRECEDING) AS dummy
FROM matches
) dt
WHERE won_ind = 'Y'
GROUP BY team_id, dummy
) dt
GROUP BY team_id;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With