I have the following table:
row_num  customer_status  effective_from_datetime
-------  ---------------  -----------------------
1        Active           2011-01-01
2        Active           2011-01-02
3        Active           2011-01-03
4        Suspended        2011-01-04
5        Suspended        2011-01-05
6        Active           2011-01-06
I am trying to achieve the following result, where consecutive rows with the same status are merged into one row with an effective from and to date range:
customer_status  effective_from_datetime  effective_to_datetime
---------------  -----------------------  ---------------------
Active           2011-01-01               2011-01-04
Suspended        2011-01-04               2011-01-06
Active           2011-01-06               NULL
I can get a recursive CTE to output the correct effective_to_datetime based on the next row, but am having trouble merging the ranges.
Code to generate sample data:
CREATE TABLE #temp
(
row_num INT IDENTITY(1,1),
customer_status VARCHAR(10),
effective_from_datetime DATE
)
INSERT INTO #temp
VALUES
('Active','2011-01-01')
,('Active','2011-01-02')
,('Active','2011-01-03')
,('Suspended','2011-01-04')
,('Suspended','2011-01-05')
,('Active','2011-01-06')
A CTE can be recursive or non-recursive. A recursive CTE is one that references itself: it returns an initial result set from its anchor query, then repeatedly runs its recursive part against the rows produced by the previous step, and stops when a step returns no more rows.
This lets a recursive CTE join a table to itself as many times as necessary to process hierarchical data in the table. CTEs also increase modularity and simplify maintenance.
By default, a recursive CTE is limited to 100 levels of recursion. You can change the limit with the MAXRECURSION query hint, which accepts a value from 0 to 32,767. A value of 0 removes the limit entirely, so use it only when you are sure the recursion terminates.
By comparison, nested functions and stored procedures support only up to 32 levels.
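As a minimal sketch of the MAXRECURSION hint described above, the following generates the integers 1 to 500. The CTE name numbers and the bound of 500 are made up for illustration and are not part of the question. Without the hint, the statement would fail once it passed the default limit of 100 levels.

```sql
-- Hypothetical example: count from 1 to 500 with a recursive CTE.
WITH numbers AS
(
    -- Anchor member: the starting row.
    SELECT 1 AS n
    UNION ALL
    -- Recursive member: runs until the WHERE clause yields no rows.
    SELECT n + 1
    FROM numbers
    WHERE n < 500
)
SELECT n
FROM numbers
OPTION (MAXRECURSION 500);  -- raise the limit from the default of 100
```

The OPTION clause belongs to the outer statement, not to the CTE definition itself.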
EDIT: SQL updated as per comment.
WITH
  group_assigned_data AS
(
  -- Number the rows overall and within each status. The difference between
  -- the two numbers is constant across a run of consecutive same-status rows.
  SELECT
    ROW_NUMBER() OVER (PARTITION BY customer_status ORDER BY effective_from_datetime) AS status_sequence_id,
    ROW_NUMBER() OVER (                             ORDER BY effective_from_datetime) AS sequence_id,
    customer_status,
    effective_from_datetime
  FROM
    #temp
)
,
  grouped_data AS
(
  -- Collapse each run into a single row.
  SELECT
    customer_status,
    MIN(effective_from_datetime) AS min_effective_from_date,
    MAX(effective_from_datetime) AS max_effective_from_date
  FROM
    group_assigned_data
  GROUP BY
    customer_status,
    sequence_id - status_sequence_id
)
SELECT
  [current].customer_status,
  [current].min_effective_from_date AS effective_from,
  [next].min_effective_from_date    AS effective_to
FROM
  grouped_data AS [current]
LEFT JOIN
  grouped_data AS [next]
    -- The next run starts the day after the current run's last row.
    -- DATEADD is used because a DATE column does not support "+ 1".
    ON [next].min_effective_from_date = DATEADD(DAY, 1, [current].max_effective_from_date)
ORDER BY
  [current].min_effective_from_date
This isn't recursive, but that's possibly a good thing.
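To see why grouping on the difference of the two row numbers works, it may help to look at the intermediate values. The query below is only an illustration against the #temp sample data from the question; the column name group_key is a label I made up for the expression sequence_id - status_sequence_id.

```sql
-- Intermediate values behind the grouping, run against the #temp sample table.
SELECT
    customer_status,
    effective_from_datetime,
    ROW_NUMBER() OVER (ORDER BY effective_from_datetime) AS sequence_id,
    ROW_NUMBER() OVER (PARTITION BY customer_status
                       ORDER BY effective_from_datetime) AS status_sequence_id,
    ROW_NUMBER() OVER (ORDER BY effective_from_datetime)
      - ROW_NUMBER() OVER (PARTITION BY customer_status
                           ORDER BY effective_from_datetime) AS group_key
FROM #temp
ORDER BY effective_from_datetime;

-- customer_status  effective_from_datetime  sequence_id  status_sequence_id  group_key
-- Active           2011-01-01               1            1                   0
-- Active           2011-01-02               2            2                   0
-- Active           2011-01-03               3            3                   0
-- Suspended        2011-01-04               4            1                   3
-- Suspended        2011-01-05               5            2                   3
-- Active           2011-01-06               6            4                   2
```

Within a run of consecutive rows both numbers increase together, so their difference stays fixed; when another status intervenes, only sequence_id moves on and the difference changes. The difference alone is not guaranteed to be unique across statuses, which is why the query also groups by customer_status.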
It doesn't deal with gaps in your data. To deal with that you could create a calendar table, with every relevant date, and join on that to fill missing dates with 'unknown' status, and then run the query against that. (In fact, you could do it in a CTE that is used by the CTEs above.)
At present...
- If row 2 were missing, the result would not change
- If row 3 were missing, the effective_to of the first result row would change
Different behaviour can be determined by preparing your data, or other methods. We'd need to know the business logic you need though.
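One possible way to prepare the data along the lines of the calendar suggestion is sketched below. It assumes the #temp table from the question, builds the calendar with a recursive CTE instead of a permanent table, and labels missing dates 'Unknown'; the names calendar, calendar_date and last_date are my own, and whether 'Unknown' is the right treatment depends on the business logic.

```sql
-- Sketch: one row per date between the first and last status change,
-- with dates absent from #temp reported as 'Unknown'.
WITH calendar AS
(
    -- Anchor member: the first date, carrying the last date along as a bound.
    SELECT
        MIN(effective_from_datetime) AS calendar_date,
        MAX(effective_from_datetime) AS last_date
    FROM #temp
    UNION ALL
    -- Recursive member: step forward one day at a time.
    SELECT
        DATEADD(DAY, 1, calendar_date),
        last_date
    FROM calendar
    WHERE calendar_date < last_date
)
SELECT
    COALESCE(t.customer_status, 'Unknown') AS customer_status,
    c.calendar_date                        AS effective_from_datetime
FROM calendar AS c
LEFT JOIN #temp AS t
    ON t.effective_from_datetime = c.calendar_date
ORDER BY c.calendar_date
OPTION (MAXRECURSION 0);  -- no recursion limit; the WHERE clause ends the loop
```

The result could stand in for the source table in the first CTE of the main query. With the default limit of 100 levels, the calendar would fail on ranges longer than roughly 100 days, hence the hint.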
If any one date can have multiple status entries, you need to define what logic you want it to follow. At present the behaviour is undefined, but you could correct that as simply as adding customer_status to the ORDER BY portions of ROW_NUMBER().
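A sketch of that change, assuming the #temp table and column names from the question, might look like this. Ordering ties alphabetically by status makes the numbering repeatable, but it is an arbitrary rule and not necessarily the one the business needs.

```sql
-- Row numbering with customer_status added as a tie-breaker, so rows that
-- share a date are always numbered in the same order.
SELECT
    ROW_NUMBER() OVER (PARTITION BY customer_status
                       ORDER BY effective_from_datetime, customer_status) AS status_sequence_id,
    ROW_NUMBER() OVER (ORDER BY effective_from_datetime, customer_status) AS sequence_id,
    customer_status,
    effective_from_datetime
FROM #temp;
```

The tie-breaker only has an effect in the unpartitioned ROW_NUMBER(); inside the partition every row already has the same customer_status.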