The following query returns a correct result but how do I get the same result faster?
The goal is to output a table for tracking sellers progress by summarizing their sales today, this week, month and quarter.
SellerID Today ThisWeek ThisMonth ThisQuarter
----------- --------------------- --------------------- --------------------- ---------------------
1 400,00 700,00 900,00 900,00
2 950,00 1850,00 2650,00 2650,00
My query:
CREATE TABLE #sales(
[Price] MONEY,
[Date] DATE,
[SellerID] INT
)
INSERT INTO #sales VALUES
(100, '2012-01-01', 1),
(200, '2012-04-01',1),
(300, '2012-04-23',1),
(400, '2012-04-27',1),
(700, '2012-01-01', 2),
(700, '2012-01-02', 2),
(800, '2012-04-01',2),
(900, '2012-04-23',2),
(950, '2012-04-27',2)
SELECT
SellerID AS SellerID,
SUM(CASE WHEN [Date] >= DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()),0) THEN [Price] END) AS Today,
SUM(CASE WHEN [Date] >= DATEADD(WEEK, DATEDIFF(WEEK, 0, GETDATE()), 0) THEN [Price] END) AS ThisWeek,
SUM(CASE WHEN [Date] >= DATEADD(MONTH, DATEDIFF(MONTH, 0, GETDATE()), 0) THEN [Price] END) AS ThisMonth,
SUM(CASE WHEN [Date] >= DATEADD(QUARTER, DATEDIFF(QUARTER, 0, GETDATE()), 0) THEN [Price] END) AS ThisQuarter
FROM #sales
WHERE DATEPART(YEAR, [Date]) = DATEPART(YEAR, GETDATE())
GROUP BY SellerID
When executing the same query on a larger table this gets quite slow. Just removing the CASE-statements cuts the execution time by almost 50%.
How can I achieve the same result in a faster and more efficient way?
Since it is Friday afternoon, I thought I'd expand on my comment regarding warehousing. even if you cannot fully explore cubes with SSAS or any other OLAP you can still do your own report specific warehousing. In your case I would set up a new Database (I always call mine DW but the world is your oyster), and create 2 schemas Fact and Dim (representing facts and dimensions). In your case it would need 2 tables, although you may want to add another dimension for "SellerID" depending on if this needs further reporting on.
CREATE TABLE Dim.Date
( DateKey DATE NOT NULL,
DayOfWeek VARCHAR(20) NOT NULL,
Day TINYINT NOT NULL,
Week TINYINT NOT NULL,
Quarter TINYINT NOT NULL,
Month TINYINT NOT NULL,
Year SMALLINT NOT NULL
CONSTRAINT PK_Dim_Date_DateKey PRIMARY KEY (DateKey)
)
CREATE TABLE Fact.Sales
( DateKey DATE NOT NULL,
SellerID INT NOT NULL,
Sales INT NOT NULL,
Amount MONEY NOT NULL,
CONSTRAINT PK_Fact_Sales PRIMARY KEY (DateKey, SellerID),
CONSTRAINT FK_Fact_Sales_DateKey FOREIGN KEY (DateKey) REFERENCES Dim.Date
)
Assuming the data will not get backdated you can use a procedure like this to fill your warehouse on a scheduled job:
DECLARE @MaxDate DATE
SELECT @MaxDate = DATEADD(DAY, 1, MAX(DateKey))
FROM Fact.Sales
INSERT INTO Dim.Date
SELECT DATEADD(DAY, Increment, @MaxDate),
DATENAME(WEEKDAY, DATEADD(DAY, Increment, @MaxDate)),
DATEPART(DAY, DATEADD(DAY, Increment, @MaxDate)),
DATEPART(WEEK, DATEADD(DAY, Increment, @MaxDate)),
DATEPART(MONTH, DATEADD(DAY, Increment, @MaxDate)),
DATEPART(QUARTER, DATEADD(DAY, Increment, @MaxDate)),
DATEPART(YEAR, DATEADD(DAY, Increment, @MaxDate))
FROM ( SELECT ROW_NUMBER() OVER(ORDER BY Object_ID) - 1 [Increment]
FROM Sys.Objects
) obj
WHERE NOT EXISTS
( SELECT 1
FROM Dim.Date
WHERE Date.DateKey = DATEADD(DAY, Increment, @MaxDate)
)
INSERT INTO Fact.Sales
SELECT [Date], SellerID, COUNT(*), SUM(Price)
FROM LiveDatabase..Sales
WHERE [Date] >= @MaxDate
GROUP BY [Date], SellerID
This would leave you with the following query to produce your report
SELECT SellerID,
SUM(CASE WHEN Today.DateKey = Date.DateKey THEN Amount ELSE O END) [Today],
SUM(CASE WHEN Today.Week = Date.Week THEN Amount ELSE O END) [ThisWeek],
SUM(CASE WHEN Today.Month = Date.Month THEN Amount ELSE O END) [ThisMonth],
SUM(CASE WHEN Today.Quarter = Date.Quarter THEN Amount ELSE O END) [ThisQuarter],
SUM(CASE WHEN Today.Year = Date.Year THEN Amount ELSE O END) [ThisYear]
FROM Fact.Sales
INNER JOIN Dim.Date
ON Date.DateKey = Sales.DateKey
INNER JOIN Dim.Date Today
ON Today.DateKey = CAST(GETDATE() AS DATE)
AND Today.Year = Date.Year
GROUP BY SellerID
It looks, if anything, more complicated than the original query, but the more the online database grows the more you will see the benefit. I've done an SQL Fiddle to demonstrate the advantages, it fills the live data with 10000 random sales records, then creates a warehouse (It may take a few seconds to build the schema). You should notice the execution time of the query on the warehouse is significantly faster (c.20x). It may not be 20x faster on the first run, but once the query plan has been cached for both queries the warehouse query is consistently 20x faster (has been for me anyway).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With