
TSQL DateTime to DateKey Int

In Scaling Up Your Data Warehouse with SQL Server 2008 R2, the author recommends using an integer date key in the format of YYYYMMDD as a clustered index on your fact tables to help optimize query speed.

What is the best way to convert your key date field to the Date Key? I feel the following would work, but is a bit sloppy:

select Replace(CONVERT(varchar,GETDATE(),102),'.','')

Clearly, I'm not using GETDATE() in practice, but rather a date column in the table that I will be using in my aggregations.

First, how would you suggest making this conversion? Is my idea acceptable?

Second, has anyone had much success using the Date Key as a clustered index?

jreed350z asked Apr 10 '12 15:04




1 Answer

The ISO style (112, which formats as yyyymmdd) would do the trick:

SELECT CONVERT(INT, CONVERT(VARCHAR(8), GETDATE(), 112))

Casting GETDATE() straight to INT with style 112 gives 41008, because a direct datetime-to-int conversion ignores the style and returns the number of days since 1900-01-01 (rounded), but going via a VARCHAR works. I'll update if I think of a faster cast.
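To illustrate the difference between the direct cast and the VARCHAR route, here is a small sketch using a fixed sample timestamp. The value 2012-04-10 15:04 and the variable name @d are assumptions chosen for illustration, not taken from the answer.

```sql
-- Hypothetical sample value; any datetime behaves the same way.
DECLARE @d DATETIME = '2012-04-10T15:04:00';

SELECT
    -- The style argument has no effect on a datetime-to-int conversion:
    -- the result is days since 1900-01-01, rounded to the nearest day
    -- (the time here is past noon, so it rounds up).
    CONVERT(INT, @d, 112)                      AS DirectCast,      -- 41008
    -- Whole days since 1900-01-01, without rounding.
    DATEDIFF(DAY, '19000101', @d)              AS DaysSinceEpoch,  -- 41007
    -- Formatting as an 8-character yyyymmdd string first gives the date key.
    CONVERT(INT, CONVERT(VARCHAR(8), @d, 112)) AS DateKey;         -- 20120410
```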

EDIT: Regarding the int-only vs. varchar debate, here are my findings (repeatable on my test rig and production server): the varchar method uses less CPU time for half a million casts but is a fraction slower overall. The difference is negligible unless you're dealing with billions of rows.
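For reference, the two approaches being compared are the VARCHAR round-trip and plain integer arithmetic on the date parts. A minimal sketch suggesting that both yield the same key, using a few made-up sample dates (the dates and the alias d are illustrative assumptions):

```sql
-- Hypothetical sample dates, including month and day values below 10.
SELECT  d.[Date],
        CONVERT(INT, CONVERT(VARCHAR(8), d.[Date], 112))               AS KeyViaVarchar,
        YEAR(d.[Date]) * 10000 + MONTH(d.[Date]) * 100 + DAY(d.[Date]) AS KeyViaArithmetic
FROM    ( VALUES ( CONVERT(DATETIME, '2012-04-10T15:04:00') ),
                 ( CONVERT(DATETIME, '2013-01-05T00:00:00') ),
                 ( CONVERT(DATETIME, '1999-12-31T23:59:59') )
        ) AS d ( [Date] );
-- Both key columns return 20120410, 20130105 and 19991231 respectively.
```

Since the results match, the choice between the two comes down to readability and the timing differences discussed here.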

EDIT 2: Revised the test case to clear the cache and use different dates.

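-- Test 1: integer arithmetic on the date parts.
-- Clear the plan cache and buffer pool so the run starts cold.
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS TIME ON;
-- TALLY is assumed to be a numbers table with an integer column N.
WITH    RawDates ( [Date] )
          AS ( SELECT TOP 500000
                        DATEADD(DAY, N, GETDATE())
               FROM     TALLY
             )
    SELECT  YEAR([Date]) * 10000 + MONTH([Date]) * 100 + DAY([Date])
    FROM    RawDates;
SET STATISTICS TIME OFF;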

(500000 row(s) affected)

 SQL Server Execution Times:
   CPU time = 218 ms,  elapsed time = 255 ms.

-- Test 2: round-trip through VARCHAR using style 112.
DBCC FREEPROCCACHE;
DBCC DROPCLEANBUFFERS;
SET STATISTICS TIME ON;
WITH    RawDates ( [Date] )
          AS ( SELECT TOP 500000
                        DATEADD(DAY, N, GETDATE())
               FROM     TALLY
             )
    SELECT  CONVERT(INT, CONVERT(VARCHAR(8), [Date], 112))
    FROM    RawDates;
SET STATISTICS TIME OFF;

(500000 row(s) affected)

 SQL Server Execution Times:
   CPU time = 266 ms,  elapsed time = 602 ms.
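
Both tests read from a TALLY table that the answer does not define. To reproduce them, something like the following could be used. The table and column names match the queries above, but the way it is populated here (cross-joining sys.all_objects) is an assumption, not the author's actual setup.

```sql
-- Assumed definition: one integer column N holding 1..500000.
CREATE TABLE TALLY ( N INT NOT NULL PRIMARY KEY );

-- Cross-joining a system catalog view with itself is one common way to
-- generate enough rows; ROW_NUMBER() then numbers them sequentially.
INSERT INTO TALLY ( N )
SELECT TOP ( 500000 )
        ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) )
FROM    sys.all_objects AS a
        CROSS JOIN sys.all_objects AS b;
```

With N running up to 500000, DATEADD(DAY, N, GETDATE()) stays well inside the DATETIME range, so the test queries should run without overflow.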

HeavenCore answered Jan 04 '23 01:01