Q: How can I rank records based on 1 column's changing value?
I have the following data (https://pastebin.com/vdTb1JRT):
EmployeeID Date Onleave
ABH12345 2016-01-01 0
ABH12345 2016-01-02 0
ABH12345 2016-01-03 0
ABH12345 2016-01-04 0
ABH12345 2016-01-05 0
ABH12345 2016-01-06 0
ABH12345 2016-01-07 0
ABH12345 2016-01-08 0
ABH12345 2016-01-09 0
ABH12345 2016-01-10 1
ABH12345 2016-01-11 1
ABH12345 2016-01-12 1
ABH12345 2016-01-13 1
ABH12345 2016-01-14 0
ABH12345 2016-01-15 0
ABH12345 2016-01-16 0
ABH12345 2016-01-17 0
I would like to produce the following results:
EmployeeID DateValidFrom DateValidTo OnLeave
ABH12345 2016-01-01 2016-01-09 0
ABH12345 2016-01-10 2016-01-13 1
ABH12345 2016-01-14 2016-01-17 0
So I'm thinking if I can somehow create a ranked column (like shown below) that increments based on the value in the Onleave column - partitioned by the EmployeeID column.
EmployeeID Date Onleave RankedCol
ABH12345 2016-01-01 0 1
ABH12345 2016-01-02 0 1
ABH12345 2016-01-03 0 1
ABH12345 2016-01-04 0 1
ABH12345 2016-01-05 0 1
ABH12345 2016-01-06 0 1
ABH12345 2016-01-07 0 1
ABH12345 2016-01-08 0 1
ABH12345 2016-01-09 0 1
ABH12345 2016-01-10 1 2
ABH12345 2016-01-11 1 2
ABH12345 2016-01-12 1 2
ABH12345 2016-01-13 1 2
ABH12345 2016-01-14 0 3
ABH12345 2016-01-15 0 3
ABH12345 2016-01-16 0 3
ABH12345 2016-01-17 0 3
Then I would be able to do the following:
SELECT
[EmployeeID] = [EmployeeID]
,[DateValidFrom] = MIN([Date])
,[DateValidTo] = MAX([Date])
,[OnLeave] = [OnLeave]
FROM table/view/cte/sub-query
GROUP BY
[EmployeeID]
,[OnLeave]
,[RankedCol]
Other solutions are very welcome..
Below is the test data :
WITH CTE AS ( SELECT EmployeeID = 'ABH12345', [Date] = CAST(N'2016-01-01' AS Date), [Onleave] = 0
UNION SELECT 'ABH12345', CAST(N'2016-01-02' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-03' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-04' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-05' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-06' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-07' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-08' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-09' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-10' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-11' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-12' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-13' AS Date), 1
UNION SELECT 'ABH12345', CAST(N'2016-01-14' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-15' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-16' AS Date), 0
UNION SELECT 'ABH12345', CAST(N'2016-01-17' AS Date), 0
)
SELECT * FROM CTE
The RANK() function creates a ranking of the rows based on a provided column. It starts with assigning “1” to the first row in the order and then gives higher numbers to rows lower in the order. If rows have the same value, they're ranked the same.
we can use rank function and group by in the same query set but all the columns should be contained in either aggregate function or the Group by clause.
To partition rows and rank them by their position within the partition, use the RANK() function with the PARTITION BY clause. SQL's RANK() function allows us to add a record's position within the result set or within each partition. In our example, we rank rows within a partition.
This order of operations implies that you can only use window functions in SELECT and ORDER BY . That is, window functions are not accessible in WHERE , GROUP BY , or HAVING clauses. For this reason, you cannot use any of these functions in WHERE : ROW_NUMBER() , RANK() , DENSE_RANK() , LEAD() , LAG() , or NTILE() .
One more way to do it with lag
. Assign groups by getting the previous Onleave value for each employeeid and resetting it when a different value is found.
select employeeid,min(date) as date_from,max(date) as date_to,max(onleave) as onleave
from (select t.*,sum(case when prev_ol=onleave then 0 else 1 end) over(partition by employeeid order by date) as grp
from (select c.*,lag(onleave,1,onleave) over(partition by employeeid order by date) as prev_ol
from cte c
) t
) t
group by employeeid,grp
Here is another, a bit simpler, way to get the desired output - accessing the table only once.
-- sample of data from your question
with t1(EmployeeID, Date1, Onleave) as(
select 'ABH12345', cast('2016-01-01' as date), 0 union all
select 'ABH12345', cast('2016-01-02' as date), 0 union all
select 'ABH12345', cast('2016-01-03' as date), 0 union all
select 'ABH12345', cast('2016-01-04' as date), 0 union all
select 'ABH12345', cast('2016-01-05' as date), 0 union all
select 'ABH12345', cast('2016-01-06' as date), 0 union all
select 'ABH12345', cast('2016-01-07' as date), 0 union all
select 'ABH12345', cast('2016-01-08' as date), 0 union all
select 'ABH12345', cast('2016-01-09' as date), 0 union all
select 'ABH12345', cast('2016-01-10' as date), 1 union all
select 'ABH12345', cast('2016-01-11' as date), 1 union all
select 'ABH12345', cast('2016-01-12' as date), 1 union all
select 'ABH12345', cast('2016-01-13' as date), 1 union all
select 'ABH12345', cast('2016-01-14' as date), 0 union all
select 'ABH12345', cast('2016-01-15' as date), 0 union all
select 'ABH12345', cast('2016-01-16' as date), 0 union all
select 'ABH12345', cast('2016-01-17' as date), 0
)
-- actual query
select max(w.employeeid) as employeeid
, min(w.date1) as datevalidfrom
, max(w.date1) as datevalidto
, max(w.onleave) as onleave
from (
select row_number() over(partition by employeeid order by date1) -
row_number() over(partition by employeeid, onleave order by date1) as grp
, employeeid
, date1
, onleave
from t1 s
) w
group by w.grp
order by employeeid, datevalidfrom
Result:
employeeid datevalidfrom datevalidto onleave
---------- ------------- ----------- -----------
ABH12345 2016-01-01 2016-01-09 0
ABH12345 2016-01-10 2016-01-13 1
ABH12345 2016-01-14 2016-01-17 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With