Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Count status Id for each day

Tags:

sql

sql-server

Situation:

I have three tables. Table 1 has Ids and subscribed date. Table 2 has Ids, activity status and the most recent date of the activity status change. Table 3 has Ids and all the logs for status changes. Note: On the Subscribed date, all IDs are active. When there are more than one status change in one day, the most recent one is the one to choose.

Objective:

I need to figure out the # of IDs for each status on each day. I.e. How many people were active, inactive and risky every single day. My issue is making sure that the status of an ID is counted everyday even if there are no data on a specific day. For example: ID 1 (see fiddle below) has been active since May 2nd (joindate) and had no status change, so he should be counted as active every single day until now.

After consulting this problem elsewhere, some have proposed to create function and a cross apply and store the counts in a table. I do not have the skills to do so but would that be an option to solve this issue?

Desired Output:

+------------+----------+-------+
|    date    |  status  | count |
+------------+----------+-------+
| 1-May-2019 | active   |     0 |
| 1-May-2019 | inactive |     0 |
| 1-May-2019 | risky    |     1 |
| 2-May-2019 | active   |     1 |
| 2-May-2019 | inactive |     0 |
| 2-May-2019 | risky    |     1 |
| 3-May-2019 | active   |     1 |
| 3-May-2019 | inactive |     0 |
| 3-May-2019 | risky    |     1 |
| 4-May-2019 | active   |     1 |
| 4-May-2019 | inactive |     0 |
| 4-May-2019 | risky    |     1 |
| 5-May-2019 | active   |     3 |
| 5-May-2019 | inactive |     0 |
| 5-May-2019 | risky    |     1 |
| ...        | ...      |   ... |
+------------+----------+-------+

Fiddle:

--create date table (not sure if usable)
CREATE TABLE #dates ([date] date)
DECLARE @dIncr DATE = '2019-05-01'
DECLARE @dEnd DATE = dateadd(day,-1,getdate())
WHILE (@dIncr <= @dEnd)
BEGIN
  INSERT INTO #dates ([date]) VALUES (@dIncr)
  SELECT @dIncr = DATEADD(day,1,@dIncr)
END
GO

-- ID + Subscribed Date (starts active at joindate)
create table #t1 (id int, [subdate] date)
insert into #t1 values 
(9, '2019-01-01'),
(1, '2019-05-02'),
(2, '2019-05-05'),
(3, '2019-05-05'),
(4, '2019-05-10')
GO

-- ID + Latest activity date
create table #t2 (id int, [status] varchar(max), [datestatus] date)
insert into #t2 values 
(9,'risky', '2019-03-01'),
(1, 'active', '2019-05-02'),
(2, 'inactive', '2019-05-13'),
(3, 'active', '2019-05-14'),
(4, 'risky', '2019-05-15')
GO

-- ID + Activity Logs Date
create table #t3 (id int, [statuschange] varchar(max), [datechange] date)
insert into #t3 values 
(9,'inactive', '2019-01-01'),
(9,'active', '2019-02-01'),
(9,'risky', '2019-03-01'),
(2, 'risky', '2019-05-08'),
(2, 'inactive', '2019-05-13'),
(3, 'inactive', '2019-05-08'),
(3, 'active', '2019-05-14'),
(4, 'inactive', '2019-05-15'),
(4, 'risky', '2019-05-15')
GO

What I have for now:

;with cte as (
    select 
        #t1.id
        ,COALESCE(LAG(datechange) over(partition by #t1.id order by datechange),subdate) as StartDate
        ,#t3.datechange
        ,COALESCE(LAG(statuschange) over(partition by #t1.id order by datechange),'active') as PreviousStatusChange
        ,#t3.statuschange
    from #t1
    inner join #t2 on #t1.id=#t2.id
    left join #t3 on #t1.id=#t3.id
) 

        select 
            cte.id
            ,cte.StartDate
            ,coalesce(cte.datechange,'2099-01-01') as EndDate
            ,PreviousStatusChange
            ,coalesce(statuschange,previousstatuschange) AS NewStatus
        from cte 
like image 736
Roger Steinberg Avatar asked May 16 '19 15:05

Roger Steinberg


People also ask

How do I count per day in SQL?

select count(*) as counter from table1 where registration_date <= '2005-01-31' AND registration_date >= '2005-01-01'; select count(*) as counter from table1 where registration_date <= '2005-02-39' AND registration_date >= '2005-02-01'; etc... ...

Which is faster count 1 or count (*)?

The simple answer is no – there is no difference at all. The COUNT(*) function counts the total rows in the table, including the NULL values. The semantics for COUNT(1) differ slightly; we'll discuss them later. However, the results for COUNT(*) and COUNT(1) are identical.

How do you count by status in SQL?

SELECT COUNT(CASE WHEN status=1 THEN 1 ELSE NULL END) as '1', COUNT(CASE WHEN status=2 THEN 1 ELSE NULL END) as '2', COUNT(CASE WHEN status=3 THEN 1 ELSE NULL END) as '3', COUNT(CASE WHEN status=4 THEN 1 ELSE NULL END) as '4' FROM TICKET; But then you will have to keep your SQL updated with any new status codes.

Should I use count (*)?

In general, you should always use COUNT(*). This is because the database can often count rows by accessing an index, which is much faster than accessing a table. If you use COUNT(column), the database must actually inspect the individual values in the column, since it will not count NULLs.


1 Answers

The dates table is the right way to go for this. You need seed data to get your desired output. I opened up your dates table so the older subscriber would fill in.

I also added a status table because your output requirement needs one row for every date for each status.

DROP TABLE IF EXISTS #dates
CREATE TABLE #dates ([date] date)
DECLARE @dIncr DATE = '01/01/2019'
DECLARE @dEnd DATE = dateadd(day,-1,getdate())
WHILE (@dIncr <= @dEnd)
BEGIN
  INSERT INTO #dates ([date]) VALUES (@dIncr)
  SELECT @dIncr = DATEADD(day,1,@dIncr)
END
GO

DROP TABLE IF EXISTS #status
CREATE TABLE #status (status varchar(20))
INSERT INTO #status VALUES
('active'),
('inactive'),
('risky')
GO

DROP TABLE IF EXISTS #t1
create table #t1 (id int, [subdate] date)
insert into #t1 values 
(9, '2019-01-01'),
(1, '2019-05-02'),
(2, '2019-05-05'),
(3, '2019-05-05'),
(4, '2019-05-10')
GO

DROP TABLE IF EXISTS #t2
create table #t2 (id int, [status] varchar(max), [datestatus] date)
insert into #t2 values 
(9,'risky', '2019-03-01'),
(1, 'active', '2019-05-02'),
(2, 'inactive', '2019-05-13'),
(3, 'active', '2019-05-14'),
(4, 'risky', '2019-05-15')
GO

DROP TABLE IF EXISTS #t3
create table #t3 (id int, [statuschange] varchar(max), [datechange] date)
insert into #t3 values 
(9,'inactive', '2019-01-01'),
(9,'active', '2019-02-01'),
(9,'risky', '2019-03-01'),
(2, 'risky', '2019-05-08'),
(2, 'inactive', '2019-05-13'),
(3, 'inactive', '2019-05-08'),
(3, 'active', '2019-05-14'),
(4, 'inactive', '2019-05-15'),
(4, 'risky', '2019-05-15')
GO

DECLARE
    @From DATE
    , @Thru DATE;

SET @From = '05/01/2019';
SET @Thru = '05/19/2019';

WITH
output_foundation AS
(
    SELECT date, status
    FROM #dates CROSS JOIN #status
)
, id_foundation AS
(
    SELECT DISTINCT id, date
    FROM #t1 CROSS JOIN #Dates
)
, id_stat AS
(
    SELECT id, datechange, statuschange FROM #t3
    UNION
    SELECT id, subdate, 'active' FROM #t1
    UNION
    SELECT id, datestatus, status FROM #t2
)
, id_spread AS
(
    SELECT
        IFDN.id
        , IFDN.date
        , IDS.statuschange
    FROM
        id_foundation AS IFDN
        LEFT OUTER JOIN id_stat AS IDS
            ON IFDN.id = IDS.id
                AND IFDN.date = IDS.datechange
), id_fill AS
(
    SELECT
        IDS.id
        , IDS.date
        , COALESCE(IDS.statuschange, LS.statuschange) AS statuschange
    FROM
        id_spread AS IDS
        OUTER APPLY
        (
            SELECT TOP 1 statuschange
            FROM id_spread
            WHERE id = IDS.id AND date < IDS.date AND statuschange IS NOT NULL
            ORDER BY date DESC
        ) AS LS
    WHERE
        (IDS.statuschange IS NOT NULL OR LS.statuschange IS NOT NULL)
)

SELECT
    OFDN.date
    , OFDN.status
    , COUNT(statuschange) AS count
FROM
    output_foundation AS OFDN
    LEFT OUTER JOIN id_fill AS IDF
        ON OFDN.date = IDF.date
            AND OFDN.status = IDF.statuschange
WHERE
    OFDN.date >= @From
    AND OFDN.date <= @Thru
GROUP BY
    OFDN.date
    , OFDN.status
ORDER BY
    OFDN.date
    , OFDN.status;
like image 147
Chris Albert Avatar answered Oct 07 '22 16:10

Chris Albert