I have a table which has Year, Month and few numeric columns
Year Month Total
2011 10 100
2011 11 150
2011 12 100
2012 01 50
2012 02 200
Now, I want to SELECT
rows between 2011 Nov and 2012 FEB. Note that I want the to Query to use range. Just as if I had a date column in the table..
Coming up with a way to use BETWEEN with the table as it is will work, but will be worse performance in every case:
I suggest the following instead if you have an index on your date columns and care at all about performance:
DECLARE
@FromDate date = '20111101',
@ToDate date = '20120201';
SELECT *
FROM dbo.YourTable T
WHERE
(
T.[Year] > Year(@FromDate)
OR (
T.[Year] = Year(@FromDate)
AND T.[Month] >= Month(@FromDate)
)
) AND (
T.[Year] < Year(@ToDate)
OR (
T.[Year] = Year(@ToDate)
AND T.[Month] <= Month(@ToDate)
)
);
However, it is understandable that you don't want to use such a construction as it is very awkward. So here is a compromise query, that at least uses numeric computation and will use less CPU than date-to-string-conversion computation (though not enough less to make up for the forced scan which is the real performance problem).
SELECT *
FROM dbo.YourTable T
WHERE
T.[Year] * 100 + T.[Month] BETWEEN 201111 AND 201202;
If you have an index on Year
, you can get a big boost by submitting the query as follows, which has the opportunity to seek:
SELECT *
FROM dbo.YourTable T
WHERE
T.[Year] * 100 + T.[Month] BETWEEN 201111 AND 201202
AND T.[Year] BETWEEN 2011 AND 2012; -- allows use of an index on [Year]
While this breaks your requirement of using a single BETWEEN
expression, it is not too much more painful and will perform very well with the Year
index.
You can also change your table. Frankly, using separate numbers for your date parts instead of a single column with a date data type is not good. The reason it isn't good is because of the exact issue you are facing right now--it is very hard to query.
In some data warehousing scenarios where saving bytes matters a lot, I could envision situations where you might store the date as a number (such as 201111
) but that is not recommended. The best solution is to change your table to use dates instead of splitting out the numeric value of the month and the year. Simply store the first day of the month, recognizing that it stands in for the entire month.
If changing the way you use these columns is not an option but you can still change your table, then you can add a persisted computed column:
ALTER Table dbo.YourTable
ADD ActualDate AS (DateAdd(year, [Year] - 1900, DateAdd(month, [Month], '18991201')))
PERSISTED;
With this you can just do:
SELECT *
FROM dbo.YourTable
WHERE
ActualDate BETWEEN '20111101' AND '20120201';
The PERSISTED
keyword means that while you still will get a scan, it won't have to do any calculation on each row since the expression is calculated on each INSERT or UPDATE and stored in the row. But you can get a seek if you add an index on this column, which will make it perform very well (though all in all, this is still not as ideal as changing to use an actual date column, because it will take more space and will affect INSERTs and UPDATEs):
CREATE NONCLUSTERED INDEX IX_YourTable_ActualDate ON dbo.YourTable (ActualDate);
Summary: if you truly can't change the table in any way, then you are going to have to make a compromise in some way. It will not be possible to get the simple syntax you want that will also perform well, when your dates are stored split into separate columns.
(Year > @FromYear OR Year = @FromYear AND Month >= @FromMonth)
AND (Year < @ToYear OR Year = @ToYear AND Month <= @ToMonth)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With