This is my code:
USE [tempdb];
GO
IF OBJECT_ID(N'dbo.t') IS NOT NULL
BEGIN
DROP TABLE dbo.t
END
GO
CREATE TABLE dbo.t
(
a NVARCHAR(8),
b NVARCHAR(8)
);
GO
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('a', 'b');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('c', 'd');
INSERT t VALUES ('e', NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
INSERT t VALUES (NULL, NULL);
GO
SELECT a, b,
COUNT(*) OVER (ORDER BY a)
FROM t;
On this page of BOL, Microsoft says that:
If PARTITION BY is not specified, the function treats all rows of the query result set as a single group.
So based on my understanding, the last SELECT
statement will give me the following result. Since all records are considered as in one single group, right?
a b
-------- -------- -----------
NULL NULL 12
NULL NULL 12
NULL NULL 12
NULL NULL 12
a b 12
a b 12
a b 12
c d 12
c d 12
c d 12
c d 12
e NULL 12
But the actual result is:
a b
-------- -------- -----------
NULL NULL 4
NULL NULL 4
NULL NULL 4
NULL NULL 4
a b 7
a b 7
a b 7
c d 11
c d 11
c d 11
c d 11
e NULL 12
Anyone can help to explain why? Thanks.
Window functions might have the following arguments in their OVER clause: PARTITION BY that divides the query result set into partitions. ORDER BY that defines the logical order of the rows within each partition of the result set.
Did you know that you can use the SQL Server aggregate functions SUM, COUNT, MAX, MIN and AVG with an OVER Clause now? Using an OVER clause you can produce individual record values along with aggregate values to different levels, without using a GROUP BY clause.
The order of filters in the WHERE clause does not matter. The SFDC Optimizer evaluates all filters to look for the indexed and most selective one.
COUNT(*) returns the number of rows in a specified table, and it preserves duplicate rows. It counts each row separately. This includes rows that contain null values.
It gives a running total (this functionality was not implemented in SQL Server until version 2012.)
The ORDER BY
defines the window to be aggregated with UNBOUNDED PRECEDING
and CURRENT ROW
as the default when not specified. SQL Server defaults to the less well performing RANGE
option rather than ROWS
.
They have different semantics in the case of ties in that the window for the RANGE
version includes not just the current row (and preceding rows) but also any additional tied rows with the same value of a
as the current row. This can be seen in the number of rows counted by each in the results below.
SELECT a,
b,
COUNT(*) OVER (ORDER BY a
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Rows],
COUNT(*) OVER (ORDER BY a
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Range],
COUNT(*) OVER() AS [Over()]
FROM t;
Returns
a b Rows Range Over()
-------- -------- ----------- ----------- -----------
NULL NULL 1 4 12
NULL NULL 2 4 12
NULL NULL 3 4 12
NULL NULL 4 4 12
a b 5 7 12
a b 6 7 12
a b 7 7 12
c d 8 11 12
c d 9 11 12
c d 10 11 12
c d 11 11 12
e NULL 12 12 12
To achieve the result that you were expecting to get omit both the PARTITION BY
and ORDER BY
and use an empty OVER()
clause (also shown above).
If ROWS/RANGE is not specified but ORDER BY is specified, RANGE UNBOUNDED PRECEDING AND CURRENT ROW is used as the default for window frame So what does that mean, let's focus on "UNBOUNDED PRECEDING AND CURRENT ROW". This gives a running total from the starting row to the current row. But in case if you want to have an overall count then you can also specify
"UNBOUNDED PRECEDING AND UNBOUNDED Following" This considers entire data set and Over() is just a shortcut of this
select a,b,
count(*) over(order by a) as [count],
COUNT(*) OVER (ORDER BY a
RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Range],
COUNT(*) OVER (ORDER BY a
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS [Rows],
COUNT(*) OVER (ORDER BY a
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED Following) AS [Range_Unbounded_following],
COUNT(*) OVER (ORDER BY a
ROWs BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED Following) AS [Row_Unbounded_following]
,COUNT(*) OVER () AS [Plain_over]
from t
order by [count]
Result is
a b count Range Rows Range_Unbounded_following Row_Unbounded_following Plain_over
-------- -------- ----------- ----------- ----------- ------------------------- ----------------------- -----------
NULL NULL 4 4 1 12 12 12
NULL NULL 4 4 2 12 12 12
NULL NULL 4 4 3 12 12 12
NULL NULL 4 4 4 12 12 12
a b 7 7 5 12 12 12
a b 7 7 6 12 12 12
a b 7 7 7 12 12 12
c d 11 11 8 12 12 12
c d 11 11 9 12 12 12
c d 11 11 10 12 12 12
c d 11 11 11 12 12 12
e NULL 12 12 12 12 12 12
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With