Situation:
I have three tables. Table 1 has Ids and subscribed date. Table 2 has Ids, activity status and the most recent date of the activity status change. Table 3 has Ids and all the logs for status changes. Note: On the Subscribed date, all IDs are active. When there are more than one status change in one day, the most recent one is the one to choose.
Objective:
I need to figure out the # of IDs for each status on each day. I.e. How many people were active, inactive and risky every single day. My issue is making sure that the status of an ID is counted everyday even if there are no data on a specific day. For example: ID 1 (see fiddle below) has been active since May 2nd (joindate) and had no status change, so he should be counted as active every single day until now.
After consulting this problem elsewhere, some have proposed to create function and a cross apply and store the counts in a table. I do not have the skills to do so but would that be an option to solve this issue?
Desired Output:
+------------+----------+-------+
| date | status | count |
+------------+----------+-------+
| 1-May-2019 | active | 0 |
| 1-May-2019 | inactive | 0 |
| 1-May-2019 | risky | 1 |
| 2-May-2019 | active | 1 |
| 2-May-2019 | inactive | 0 |
| 2-May-2019 | risky | 1 |
| 3-May-2019 | active | 1 |
| 3-May-2019 | inactive | 0 |
| 3-May-2019 | risky | 1 |
| 4-May-2019 | active | 1 |
| 4-May-2019 | inactive | 0 |
| 4-May-2019 | risky | 1 |
| 5-May-2019 | active | 3 |
| 5-May-2019 | inactive | 0 |
| 5-May-2019 | risky | 1 |
| ... | ... | ... |
+------------+----------+-------+
Fiddle:
--create date table (not sure if usable)
CREATE TABLE #dates ([date] date)
DECLARE @dIncr DATE = '2019-05-01'
DECLARE @dEnd DATE = dateadd(day,-1,getdate())
WHILE (@dIncr <= @dEnd)
BEGIN
INSERT INTO #dates ([date]) VALUES (@dIncr)
SELECT @dIncr = DATEADD(day,1,@dIncr)
END
GO
-- ID + Subscribed Date (starts active at joindate)
create table #t1 (id int, [subdate] date)
insert into #t1 values
(9, '2019-01-01'),
(1, '2019-05-02'),
(2, '2019-05-05'),
(3, '2019-05-05'),
(4, '2019-05-10')
GO
-- ID + Latest activity date
create table #t2 (id int, [status] varchar(max), [datestatus] date)
insert into #t2 values
(9,'risky', '2019-03-01'),
(1, 'active', '2019-05-02'),
(2, 'inactive', '2019-05-13'),
(3, 'active', '2019-05-14'),
(4, 'risky', '2019-05-15')
GO
-- ID + Activity Logs Date
create table #t3 (id int, [statuschange] varchar(max), [datechange] date)
insert into #t3 values
(9,'inactive', '2019-01-01'),
(9,'active', '2019-02-01'),
(9,'risky', '2019-03-01'),
(2, 'risky', '2019-05-08'),
(2, 'inactive', '2019-05-13'),
(3, 'inactive', '2019-05-08'),
(3, 'active', '2019-05-14'),
(4, 'inactive', '2019-05-15'),
(4, 'risky', '2019-05-15')
GO
What I have for now:
;with cte as (
select
#t1.id
,COALESCE(LAG(datechange) over(partition by #t1.id order by datechange),subdate) as StartDate
,#t3.datechange
,COALESCE(LAG(statuschange) over(partition by #t1.id order by datechange),'active') as PreviousStatusChange
,#t3.statuschange
from #t1
inner join #t2 on #t1.id=#t2.id
left join #t3 on #t1.id=#t3.id
)
select
cte.id
,cte.StartDate
,coalesce(cte.datechange,'2099-01-01') as EndDate
,PreviousStatusChange
,coalesce(statuschange,previousstatuschange) AS NewStatus
from cte
select count(*) as counter from table1 where registration_date <= '2005-01-31' AND registration_date >= '2005-01-01'; select count(*) as counter from table1 where registration_date <= '2005-02-39' AND registration_date >= '2005-02-01'; etc... ...
The simple answer is no – there is no difference at all. The COUNT(*) function counts the total rows in the table, including the NULL values. The semantics for COUNT(1) differ slightly; we'll discuss them later. However, the results for COUNT(*) and COUNT(1) are identical.
SELECT COUNT(CASE WHEN status=1 THEN 1 ELSE NULL END) as '1', COUNT(CASE WHEN status=2 THEN 1 ELSE NULL END) as '2', COUNT(CASE WHEN status=3 THEN 1 ELSE NULL END) as '3', COUNT(CASE WHEN status=4 THEN 1 ELSE NULL END) as '4' FROM TICKET; But then you will have to keep your SQL updated with any new status codes.
In general, you should always use COUNT(*). This is because the database can often count rows by accessing an index, which is much faster than accessing a table. If you use COUNT(column), the database must actually inspect the individual values in the column, since it will not count NULLs.
The dates table is the right way to go for this. You need seed data to get your desired output. I opened up your dates table so the older subscriber would fill in.
I also added a status table because your output requirement needs one row for every date for each status.
DROP TABLE IF EXISTS #dates
CREATE TABLE #dates ([date] date)
DECLARE @dIncr DATE = '01/01/2019'
DECLARE @dEnd DATE = dateadd(day,-1,getdate())
WHILE (@dIncr <= @dEnd)
BEGIN
INSERT INTO #dates ([date]) VALUES (@dIncr)
SELECT @dIncr = DATEADD(day,1,@dIncr)
END
GO
DROP TABLE IF EXISTS #status
CREATE TABLE #status (status varchar(20))
INSERT INTO #status VALUES
('active'),
('inactive'),
('risky')
GO
DROP TABLE IF EXISTS #t1
create table #t1 (id int, [subdate] date)
insert into #t1 values
(9, '2019-01-01'),
(1, '2019-05-02'),
(2, '2019-05-05'),
(3, '2019-05-05'),
(4, '2019-05-10')
GO
DROP TABLE IF EXISTS #t2
create table #t2 (id int, [status] varchar(max), [datestatus] date)
insert into #t2 values
(9,'risky', '2019-03-01'),
(1, 'active', '2019-05-02'),
(2, 'inactive', '2019-05-13'),
(3, 'active', '2019-05-14'),
(4, 'risky', '2019-05-15')
GO
DROP TABLE IF EXISTS #t3
create table #t3 (id int, [statuschange] varchar(max), [datechange] date)
insert into #t3 values
(9,'inactive', '2019-01-01'),
(9,'active', '2019-02-01'),
(9,'risky', '2019-03-01'),
(2, 'risky', '2019-05-08'),
(2, 'inactive', '2019-05-13'),
(3, 'inactive', '2019-05-08'),
(3, 'active', '2019-05-14'),
(4, 'inactive', '2019-05-15'),
(4, 'risky', '2019-05-15')
GO
DECLARE
@From DATE
, @Thru DATE;
SET @From = '05/01/2019';
SET @Thru = '05/19/2019';
WITH
output_foundation AS
(
SELECT date, status
FROM #dates CROSS JOIN #status
)
, id_foundation AS
(
SELECT DISTINCT id, date
FROM #t1 CROSS JOIN #Dates
)
, id_stat AS
(
SELECT id, datechange, statuschange FROM #t3
UNION
SELECT id, subdate, 'active' FROM #t1
UNION
SELECT id, datestatus, status FROM #t2
)
, id_spread AS
(
SELECT
IFDN.id
, IFDN.date
, IDS.statuschange
FROM
id_foundation AS IFDN
LEFT OUTER JOIN id_stat AS IDS
ON IFDN.id = IDS.id
AND IFDN.date = IDS.datechange
), id_fill AS
(
SELECT
IDS.id
, IDS.date
, COALESCE(IDS.statuschange, LS.statuschange) AS statuschange
FROM
id_spread AS IDS
OUTER APPLY
(
SELECT TOP 1 statuschange
FROM id_spread
WHERE id = IDS.id AND date < IDS.date AND statuschange IS NOT NULL
ORDER BY date DESC
) AS LS
WHERE
(IDS.statuschange IS NOT NULL OR LS.statuschange IS NOT NULL)
)
SELECT
OFDN.date
, OFDN.status
, COUNT(statuschange) AS count
FROM
output_foundation AS OFDN
LEFT OUTER JOIN id_fill AS IDF
ON OFDN.date = IDF.date
AND OFDN.status = IDF.statuschange
WHERE
OFDN.date >= @From
AND OFDN.date <= @Thru
GROUP BY
OFDN.date
, OFDN.status
ORDER BY
OFDN.date
, OFDN.status;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With