I am trying to calculate running count for each 2 rows like below,
CREATE TABLE sales
(
EmpId INT,
Yr INT,
Sales DECIMAL(8,2)
)
INSERT INTO sales (EmpId, Yr, Sales)
VALUES (1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
(1, 2008, 25000), (1, 2009, 25000),
(2, 2005, 15000), (2, 2006, 6000), (2, 2007, 6000)
SELECT
EmpId, Yr, sales,
SUM(Sales) OVER (PARTITION BY empid ORDER BY empid ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS TotalSales2
FROM
sales
Output:
EmpId Yr sales TotalSales2
-----------------------------------
1 2005 12000 12000
1 2006 18000 30000
1 2007 25000 55000
1 2008 25000 68000
1 2009 25000 75000
2 2005 15000 15000
2 2006 6000 21000
2 2007 6000 27000
But expected output:
EmpId Yr Sales TotalSales2
-----------------------------------
1 2005 12000 12000
1 2006 18000 30000
1 2007 25000 25000
1 2008 25000 50000
1 2009 25000 25000
2 2005 15000 15000
2 2006 6000 21000
2 2007 6000 6000
What am I doing wrong in this query?
Note: SQL Servre version is 2012.
If you need a quick way to count rows that contain data, select all the cells in the first column of that data (it may not be column A). Just click the column header. The status bar, in the lower-right corner of your Excel window, will tell you the row count.
The SQL COUNT( ) function is used to return the number of rows in a table. It is used with the Select( ) statement.
Use the COUNT aggregate function to count the number of rows in a table. This function takes the name of the column as its argument (e.g., id ) and returns the number of rows for this particular column in the table (e.g., 5).
You can use the COUNT function to count column values, or you can use the COUNTROWS function to count table rows. Both functions will achieve the same result, providing that the counted column contains no BLANKs. The following measure definition presents an example. It calculates the number of OrderDate column values.
SELECT EmpId, Yr, Sales,
CASE WHEN ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY yr) % 2 = 0
THEN sales + lag(sales, 1, 0) OVER (PARTITION BY empid ORDER BY yr)
ELSE sales
END AS TotalSales2
FROM sales
Lag returns the previous row's value - when row_number() is even, add the current row's value to the previous row - otherwise, just show the sales for the current row. Partition each by EmpId, order each by yr - output matches the expected.
Also, thanks so much for adding the DDL/sample data.
The expression:
SUM(Sales) OVER (PARTITION BY empid
ORDER BY empid
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
calculates the sum considering the current row and the 2 rows immediately preceding it. So it actually calculates a rolling sum, which is what you really don't want.
I think you are actually looking for something like the following:
;WITH CTE_Group AS (
SELECT EmpId, Yr, sales,
(ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) + 1 ) / 2 AS grp
FROM sales
)
SELECT EmpId, Yr, sales,
SUM(sales) OVER (PARTITION BY empid, grp
ORDER BY yr) AS TotalSales2
FROM CTE_Group
The above query uses a CTE
in order to calculate field grp
: the value of this field is 1
for the first two records of an empid
partition, 2
for the next two records, and so on.
Using grp
we can calculate the running total of sales
for groups of 2 as is the requirement of the OP.
Demo here
Edit:
To offset a larger group of records try using (credit goes to @Max Szczurek for pointing this out):
(ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) - 1 ) / n AS grp
where n
is the number of records each group contains.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With