Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server - group by with row number - Gaps and Islands

I have a table with data like this

+-------------+--------------+------------+----------------+
| CustomerSID | StartDateSID | EndDateSID | MarketingOptIn |
+-------------+--------------+------------+----------------+
|       12345 |     20101019 |   20131016 | Y              |
|       12345 |     20131017 |   20140413 | Y              |
|       12345 |     20140414 |   20140817 | N              |
|       12345 |     20140818 |   20141228 | N              |
|       12345 |     20141229 |   20150125 | Y              |
|       12345 |     20150126 |          0 | Y              |
+-------------+--------------+------------+----------------+

I need to create a view on top of this table to have data formatted in the below format for the Flag, basically the duration for which the Flag was Y or N. (EndDateSID - 0 is currently active, so today's date)

+-------------+--------------+------------+----------------+
| CustomerSID | StartDateSID | EndDateSID | MarketingOptIn |
+-------------+--------------+------------+----------------+
|       12345 |     20101019 |   20140413 | Y              |
|       12345 |     20140414 |   20141228 | N              |
|       12345 |     20141229 |   20150825 | Y              |
+-------------+--------------+------------+----------------+

Most customers only have a change in their Flag once, hence below query works:

SELECT 
CH1.CustomerSID
,MIN(CH1.StartDateSID) StartDate
,MAX(ISNULL(NULLIF(CH1.EndDateSID,0),CONVERT(INT, CONVERT(VARCHAR, GETDATE(), 112)))) EndDate
,CH1.MarketingOptIn 
FROM DWH.DimCustomerHistory CH1
GROUP BY CH1.CustomerSID, CH1.MarketingOptIn
ORDER BY CH1.CustomerSID, CH1.MarketingOptIn

How can I achieve the intended output for customers like the one above, having changes in the flag more than once?

like image 846
Rohit P Avatar asked Dec 25 '22 15:12

Rohit P


1 Answers

You can use the following query:

SELECT CustomerSID,
       MIN(StartDateSID) AS StartDate,
       MAX(ISNULL(NULLIF(EndDateSID,0),
           CONVERT(INT, CONVERT(VARCHAR, GETDATE(), 112)))) AS EndDate,
       MarketingOptIn
FROM (       
  SELECT CustomerSID, StartDateSID, EndDateSID, MarketingOptIn,
         ROW_NUMBER() OVER (ORDER BY StartDateSID) - 
         ROW_NUMBER() OVER (PARTITION BY CustomerSID, MarketingOptIn 
                            ORDER BY StartDateSID) AS grp     
  FROM DimCustomerHistory ) AS t
GROUP BY CustomerSID, MarketingOptIn, grp
ORDER BY StartDate

Calculated field grp serves to identify consecutive records having the same MarketingOptIn value.

Utilizing this field in an outer query, you can easily GROUP BY and apply MIN and MAX aggregate functions in a way similar to your original query.

Demo here

like image 116
Giorgos Betsos Avatar answered Dec 27 '22 05:12

Giorgos Betsos