Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Running count for each 2 rows

I am trying to calculate running count for each 2 rows like below,

CREATE TABLE sales
(
     EmpId INT, 
     Yr INT, 
     Sales DECIMAL(8,2)
)

INSERT INTO sales (EmpId, Yr, Sales)
VALUES (1, 2005, 12000), (1, 2006, 18000), (1, 2007, 25000),
       (1, 2008, 25000), (1, 2009, 25000),
       (2, 2005, 15000), (2, 2006, 6000), (2, 2007, 6000)

SELECT 
    EmpId, Yr, sales, 
    SUM(Sales) OVER (PARTITION BY empid ORDER BY empid ROWS BETWEEN 2 PRECEDING AND CURRENT ROW ) AS TotalSales2
FROM 
    sales

Output:

EmpId   Yr      sales   TotalSales2
-----------------------------------
  1     2005    12000      12000
  1     2006    18000      30000
  1     2007    25000      55000
  1     2008    25000      68000
  1     2009    25000      75000
  2     2005    15000      15000
  2     2006     6000      21000
  2     2007     6000      27000

But expected output:

EmpId   Yr     Sales    TotalSales2
-----------------------------------
  1     2005    12000   12000
  1     2006    18000   30000
  1     2007    25000   25000   
  1     2008    25000   50000
  1     2009    25000   25000   
  2     2005    15000   15000
  2     2006     6000   21000
  2     2007     6000    6000

What am I doing wrong in this query?

Note: SQL Servre version is 2012.

like image 526
MMMMS Avatar asked Mar 29 '17 06:03

MMMMS


People also ask

How do you count rows?

If you need a quick way to count rows that contain data, select all the cells in the first column of that data (it may not be column A). Just click the column header. The status bar, in the lower-right corner of your Excel window, will tell you the row count.

Which is used to count number of rows?

The SQL COUNT( ) function is used to return the number of rows in a table. It is used with the Select( ) statement.

How do I count rows in a table?

Use the COUNT aggregate function to count the number of rows in a table. This function takes the name of the column as its argument (e.g., id ) and returns the number of rows for this particular column in the table (e.g., 5).

What is a difference between count and row count?

You can use the COUNT function to count column values, or you can use the COUNTROWS function to count table rows. Both functions will achieve the same result, providing that the counted column contains no BLANKs. The following measure definition presents an example. It calculates the number of OrderDate column values.


2 Answers

SELECT EmpId, Yr, Sales, 
    CASE WHEN ROW_NUMBER() OVER (PARTITION BY EmpId ORDER BY yr) % 2 = 0 
    THEN sales + lag(sales, 1, 0) OVER (PARTITION BY empid ORDER BY yr) 
    ELSE sales 
    END AS TotalSales2
FROM sales

Lag returns the previous row's value - when row_number() is even, add the current row's value to the previous row - otherwise, just show the sales for the current row. Partition each by EmpId, order each by yr - output matches the expected.

Also, thanks so much for adding the DDL/sample data.

like image 95
Max Szczurek Avatar answered Sep 18 '22 09:09

Max Szczurek


The expression:

SUM(Sales) OVER (PARTITION BY empid 
                 ORDER BY empid 
                 ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)

calculates the sum considering the current row and the 2 rows immediately preceding it. So it actually calculates a rolling sum, which is what you really don't want.

I think you are actually looking for something like the following:

;WITH CTE_Group AS (
    SELECT EmpId, Yr, sales,        
          (ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) + 1 ) / 2 AS grp
    FROM sales      
)
SELECT EmpId, Yr, sales,
       SUM(sales) OVER (PARTITION BY empid, grp 
                        ORDER BY yr) AS TotalSales2
FROM CTE_Group

The above query uses a CTE in order to calculate field grp: the value of this field is 1 for the first two records of an empid partition, 2 for the next two records, and so on.

Using grp we can calculate the running total of sales for groups of 2 as is the requirement of the OP.

Demo here

Edit:

To offset a larger group of records try using (credit goes to @Max Szczurek for pointing this out):

(ROW_NUMBER() OVER (PARTITION BY empid ORDER BY yr) - 1 ) / n AS grp

where n is the number of records each group contains.

like image 29
Giorgos Betsos Avatar answered Sep 19 '22 09:09

Giorgos Betsos