Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

SQL Server BETWEEN

I have a table which has Year, Month and few numeric columns

Year   Month  Total
2011     10    100
2011     11    150
2011     12    100  
2012     01    50
2012     02    200

Now, I want to SELECT rows between 2011 Nov and 2012 FEB. Note that I want the to Query to use range. Just as if I had a date column in the table..

like image 532
user173552 Avatar asked Dec 26 '22 12:12

user173552


2 Answers

Coming up with a way to use BETWEEN with the table as it is will work, but will be worse performance in every case:

  • It will at best consume more CPU to do some kind of calculation on the rows instead of working with them as dates.
  • It will at the very worst force a table scan on every row in the table, but if your columns have indexes, then with the right query a seek is possible. This could be a HUGE performance difference, because forcing the constraints into a BETWEEN clause will disable using the index.

I suggest the following instead if you have an index on your date columns and care at all about performance:

DECLARE
   @FromDate date = '20111101',
   @ToDate date = '20120201';

SELECT *
FROM dbo.YourTable T
WHERE
   (
      T.[Year] > Year(@FromDate)
      OR (    
         T.[Year] = Year(@FromDate)
         AND T.[Month] >= Month(@FromDate)
      )
   ) AND (
      T.[Year] < Year(@ToDate)
      OR (
         T.[Year] = Year(@ToDate)
         AND T.[Month] <= Month(@ToDate)
      )
   );

However, it is understandable that you don't want to use such a construction as it is very awkward. So here is a compromise query, that at least uses numeric computation and will use less CPU than date-to-string-conversion computation (though not enough less to make up for the forced scan which is the real performance problem).

SELECT *
FROM dbo.YourTable T
WHERE
   T.[Year] * 100 + T.[Month] BETWEEN 201111 AND 201202;

If you have an index on Year, you can get a big boost by submitting the query as follows, which has the opportunity to seek:

SELECT *
FROM dbo.YourTable T
WHERE
   T.[Year] * 100 + T.[Month] BETWEEN 201111 AND 201202
   AND T.[Year] BETWEEN 2011 AND 2012; -- allows use of an index on [Year]

While this breaks your requirement of using a single BETWEEN expression, it is not too much more painful and will perform very well with the Year index.

You can also change your table. Frankly, using separate numbers for your date parts instead of a single column with a date data type is not good. The reason it isn't good is because of the exact issue you are facing right now--it is very hard to query.

In some data warehousing scenarios where saving bytes matters a lot, I could envision situations where you might store the date as a number (such as 201111) but that is not recommended. The best solution is to change your table to use dates instead of splitting out the numeric value of the month and the year. Simply store the first day of the month, recognizing that it stands in for the entire month.

If changing the way you use these columns is not an option but you can still change your table, then you can add a persisted computed column:

ALTER Table dbo.YourTable
   ADD ActualDate AS (DateAdd(year, [Year] - 1900, DateAdd(month, [Month], '18991201')))
   PERSISTED;

With this you can just do:

SELECT *
FROM dbo.YourTable
WHERE
   ActualDate BETWEEN '20111101' AND '20120201';

The PERSISTED keyword means that while you still will get a scan, it won't have to do any calculation on each row since the expression is calculated on each INSERT or UPDATE and stored in the row. But you can get a seek if you add an index on this column, which will make it perform very well (though all in all, this is still not as ideal as changing to use an actual date column, because it will take more space and will affect INSERTs and UPDATEs):

CREATE NONCLUSTERED INDEX IX_YourTable_ActualDate ON dbo.YourTable (ActualDate);

Summary: if you truly can't change the table in any way, then you are going to have to make a compromise in some way. It will not be possible to get the simple syntax you want that will also perform well, when your dates are stored split into separate columns.

like image 110
ErikE Avatar answered Jan 11 '23 06:01

ErikE


(Year > @FromYear OR Year = @FromYear AND Month >= @FromMonth)
AND (Year < @ToYear OR Year = @ToYear AND Month <= @ToMonth)
like image 21
Dr. Wily's Apprentice Avatar answered Jan 11 '23 07:01

Dr. Wily's Apprentice