This feels like it should be an easy one. How do I get the latest dates that are in different columns
DROP TABLE #indebtedness
CREATE TABLE #indebtedness (call_case CHAR(10), date1 DATETIME, date2 DATETIME, date3 DATETIME)
INSERT #indebtedness VALUES ('Key1', '2019-10-30', '2019-11-30', '2019-10-25')
INSERT #indebtedness VALUES ('Key2', '2019-10-20', '2019-10-30', '2019-10-15')
INSERT #indebtedness VALUES ('Key3', '2019-11-11', '2019-10-29', '2019-10-30')
INSERT #indebtedness VALUES ('Key4', null , '2019-10-29', '2019-10-13')
select call_case, ?? AS 'Latest Date' from #indebtedness
I would like the result to be:
call_case Latest Date
Key1 2019-11-30
Key2 2019-10-30
Key3 2019-11-11
Key4 2019-10-29
Use a CASE
expression:
SELECT
call_case,
CASE WHEN date1 > date2 AND date1 > date3
THEN date1
WHEN date2 > date3
THEN date2
ELSE date3 END AS [Latest Date]
FROM #indebtedness;
Note that some databases, such as MySQL, SQL Server, and SQLite, support a scalar greatest function. SQL Server does not, so we can use a CASE
expression as a workaround.
Edit:
It appears that in your actual table, one or more of the three date columns could have NULL
values. We can adapt the above query as follows:
SELECT
call_case,
CASE WHEN (date1 > date2 OR date2 IS NULL) AND (date1 > date3 OR date3 IS NULL)
THEN date1
WHEN date2 > date3 OR date3 IS NULL
THEN date2
ELSE date3 END AS [Latest Date]
FROM #indebtedness;
The currently accepted answer is the best answer, but I don't think it does a good enough job of explaining why. The other answers certainly look much cleaner at a glance (who wants to write that ugly case statement), but are likely to be much worse when you start operating at scale.
SELECT @@VERSION
Microsoft SQL Server 2016 (SP2) (KB4052908) - 13.0.5026.0 (X64)
Mar 18 2018 09:11:49
Copyright (c) Microsoft Corporation
Developer Edition (64-bit) on Windows 10 Enterprise 10.0 <X64> (Build 17763: )
Here is how I set everything up
DECLARE @Offset bigint = 0;
DECLARE @Max bigint = 10000000;
DROP TABLE IF EXISTS #Indebtedness;
CREATE TABLE #Indebtedness
(
call_case char(10) COLLATE DATABASE_DEFAULT NOT NULL,
date1 datetime NULL,
date2 datetime NULL,
date3 datetime NULL
);
WHILE @Offset < @Max
BEGIN
INSERT INTO #Indebtedness
( call_case, date1, date2, date3 )
SELECT @Offset + ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL )),
DATEADD( DAY,
CASE WHEN RAND() > 0 THEN 1
ELSE -1 END * ROUND( RAND(), 0 ),
CURRENT_TIMESTAMP ),
DATEADD( DAY,
CASE WHEN RAND() > 0 THEN 1
ELSE -1 END * ROUND( RAND(), 0 ),
CURRENT_TIMESTAMP ),
DATEADD( DAY,
CASE WHEN RAND() > 0 THEN 1
ELSE -1 END * ROUND( RAND(), 0 ),
CURRENT_TIMESTAMP )
FROM master.dbo.spt_values a
CROSS APPLY master.dbo.spt_values b;
SET @Offset = @Offset + ROWCOUNT_BIG();
END;
On my system this gets me 12,872,738 rows in the table. If I try each of the above queries (tweaked to SELECT INTO
so I don't need to wait for it to finish printing the results in SSMS), I get the following results:
Method | CPU time (ms) | Elapsed time (ms) | Relative Cost
-----------------------------------------------------------------------------------------
Tim Biegeleisen (CASE) | 13485 | 2167 | 2%
Red Devil (Subquery over MAX columns) | 55187 | 9891 | 14%
Vignesh Kumar (Subquery over columns) | 33750 | 5139 | 5%
Serkan Arslan (UNPIVOT) | 86205 | 15023 | 12%
Metal (STRING_SPLIT) | 459668 | 186742 | 68%
If you look at the query plans, it becomes pretty obvious why - adding any kind of unpivot or aggregate (or heaven forbid STRING_SPLIT
) you'll end up with all sorts of additional operators that you don't need (and it forces the plan to go parallel, taking away resources other queries might want). By contract, the CASE
based solution does not go parallel, runs very quickly, and is incredibly simple.
In this case, unless you have unlimited resources (you don't), you should pick the simplest and fastest approach.
There was a question of what to do if you need to keep adding new columns and expanding the case statement. Yes, this gets unwieldy, but so does every other solution. If this is actually a plausible workflow, then you should redesign your table. What you want probably looks something like this:
CREATE TABLE #Indebtedness2
(
call_case char(10) COLLATE DATABASE_DEFAULT NOT NULL,
activity_type bigint NOT NULL, -- This indicates which date# column it was, if you care
timestamp datetime NOT NULL
);
SELECT Indebtedness.call_case,
Indebtedness.activity_type,
Indebtedness.timestamp
FROM ( SELECT call_case,
activity_type,
timestamp,
ROW_NUMBER() OVER ( PARTITION BY call_case
ORDER BY timestamp DESC ) RowNumber
FROM #Indebtedness2 ) Indebtedness
WHERE Indebtedness.RowNumber = 1;
This is certainly not free of potential performance issues, and will require careful index tuning, but is the best way to handle an arbitrary number of potential timestamps
In case any answers get deleted, here are the versions I was comparing (in order)
SELECT
call_case,
CASE WHEN date1 > date2 AND date1 > date3
THEN date1
WHEN date2 > date3
THEN date2
ELSE date3 END AS [Latest Date]
FROM #indebtedness;
SELECT call_case,
(SELECT Max(v)
FROM (VALUES (date1), (date2), (date3),...) AS value(v)) as [MostRecentDate]
FROM #indebtedness
SELECT call_case,
(SELECT
MAX(call_case)
FROM ( VALUES
(MAX(date1)),
(MAX(date2))
,(max(date3))
) MyAlias(call_case)
)
FROM #indebtedness
group by call_case
select call_case, MAX(date) [Latest Date] from #indebtedness
UNPIVOT(date FOR col IN ([date1], [date2], [date3])) UNPVT
GROUP BY call_case
select call_case , max(cast(x.Item as date)) as 'Latest Date' from #indebtedness t
cross apply dbo.SplitString(concat(date1, ',', date2, ',', date3), ',') x
group by call_case
Try this:
SELECT call_case,
(SELECT
MAX(call_case)
FROM ( VALUES
(MAX(date1)),
(MAX(date2))
,(max(date3))
) MyAlias(call_case)
)
FROM #indebtedness
group by call_case
SQL FIDDLE
Use MAX()
SELECT call_case,
(SELECT Max(v)
FROM (VALUES (date1), (date2), (date3),...) AS value(v)) as [MostRecentDate]
FROM #indebtedness
Use CASE
SELECT
CASE
WHEN Date1 >= Date2 AND Date1 >= Date3 THEN Date1
WHEN Date2 >= Date1 AND Date2 >= Date3 THEN Date2
WHEN Date3 >= Date1 AND Date3 >= Date2 THEN Date3
ELSE Date1
END AS MostRecentDate
FROM #indebtedness
In my view, Pivot is the best and efficient option for this query. Copy and Paste in the MS SQL SERVER. Please check the code written below:
CREATE TABLE #indebtedness (call_case CHAR(10), date1 DATETIME, date2 DATETIME, date3 DATETIME)
INSERT #indebtedness VALUES ('Key1', '2019-10-30', '2019-11-30', '2019-10-31')
INSERT #indebtedness VALUES ('Key2', '2019-10-20', '2019-10-30', '2019-11-21')
INSERT #indebtedness VALUES ('Key3', '2019-11-11', '2019-10-29', '2019-10-30')
INSERT #indebtedness VALUES ('Key4', Null, '2019-10-29', '2019-10-13')
--Solution-1:
SELECT
call_case,
MAX(RecnetDate) as MaxDateColumn
FROM #indebtedness
UNPIVOT
(RecnetDate FOR COL IN ([date1], [date2], [date3])) as TRANSPOSE
GROUP BY call_case
--Solution-2:
select
call_case, case
when date1>date2 and date1 > date3 then date1
when date2>date3 then date2
when date3>date1 then date1
else date3 end as date
from #indebtedness as a
Drop table #indebtedness
This really should be re-evaluated at the design level as others have indicated. Below is an example of a different design using two tables to better accomplish what it appears you are looking for in your results. This will make growth much more favorable.
Here is an example (different table names used):
-- Drop pre-existing tables
DROP TABLE #call_log
DROP TABLE #case_type
-- Create table for Case Types
CREATE TABLE #case_type (id INT PRIMARY KEY CLUSTERED NOT NULL,
descript VARCHAR(50) NOT NULL)
INSERT #case_type VALUES (1,'No Answer')
INSERT #case_type VALUES (2,'Answer')
INSERT #case_type VALUES (3,'Not Exist')
INSERT #case_type VALUES (4,'whatsapp')
INSERT #case_type VALUES (5,'autodial')
INSERT #case_type VALUES (6,'SMS')
-- Create a Call Log table with a primary identity key and also an index on the call types
CREATE TABLE #call_log (call_num BIGINT PRIMARY KEY CLUSTERED IDENTITY NOT NULL,
call_type INT NOT NULL REFERENCES #case_type(id), call_date DATETIME)
CREATE NONCLUSTERED INDEX ix_call_log_entry_type ON #call_log(call_type)
INSERT #call_log(call_type, call_date) VALUES (1,'2019-11-30')
INSERT #call_log(call_type, call_date) VALUES (2,'2019-10-15')
INSERT #call_log(call_type, call_date) VALUES (3,null)
INSERT #call_log(call_type, call_date) VALUES (3,'2019-10-29')
INSERT #call_log(call_type, call_date) VALUES (1,'2019-10-25')
INSERT #call_log(call_type, call_date) VALUES (2,'2019-10-30')
INSERT #call_log(call_type, call_date) VALUES (3,'2019-10-13')
INSERT #call_log(call_type, call_date) VALUES (2,'2019-10-20')
INSERT #call_log(call_type, call_date) VALUES (1,'2019-10-30')
-- use an aggregate to show only the latest date for each case type
SELECT DISTINCT ct.descript, MAX(cl.call_date) AS "Date"
FROM #call_log cl JOIN #case_type ct ON cl.call_type = ct.id GROUP BY ct.descript
This allows for more case types to be added, many more log entries to be added and provides a better design.
This is just an example for learning purposes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With