
The SQL OVER() clause - when and why is it useful?

    USE AdventureWorks2008R2;
    GO
    SELECT SalesOrderID, ProductID, OrderQty
        ,SUM(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Total'
        ,AVG(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Avg'
        ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Count'
        ,MIN(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Min'
        ,MAX(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'Max'
    FROM Sales.SalesOrderDetail
    WHERE SalesOrderID IN(43659,43664);

I read about this clause and I don't understand why I need it. What does OVER do? What does PARTITION BY do? Why can't I just write the query with GROUP BY SalesOrderID?

WithFlyingColors asked Jun 02 '11 18:06


People also ask

What is the use of over () in SQL?

Determines the partitioning and ordering of a rowset before the associated window function is applied. That is, the OVER clause defines a window or user-specified set of rows within a query result set. A window function then computes a value for each row in the window.

What is the significance of over () and partition by clauses?

A PARTITION BY clause divides the rows of a table into groups. It is useful when a calculation on an individual row needs the other rows of the same group. It is always used inside an OVER() clause. Each partition formed by the PARTITION BY clause is also known as a window.

Why we use over partition by in SQL Server?

Use PARTITION BY when you need aggregated values computed separately for each of several groups of rows. It also lets you keep the original rows and add the aggregated values as an extra column.

Which type of functions use over clause?

The OVER clause is essential to SQL window functions. Like aggregation functions, window functions perform calculations based on a set of records – e.g. finding the average salary across a group of employees.


2 Answers

You can use GROUP BY SalesOrderID. The difference is that with GROUP BY you can only select aggregated values for the columns that are not included in the GROUP BY.

In contrast, using windowed aggregate functions instead of GROUP BY, you can retrieve both aggregated and non-aggregated values. That is, as your example query does, you can retrieve both the individual OrderQty values and their sums, counts, averages etc. over groups of rows with the same SalesOrderID.
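To see the difference in row counts, here is a small sketch you can run without SQL Server. It is an assumption-heavy stand-in: it uses Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions) and three made-up rows in a table named like the one in the question, not real AdventureWorks data.

```python
import sqlite3

# Hypothetical sample rows (SalesOrderID, ProductID, OrderQty); not AdventureWorks data.
rows = [(1, 10, 2), (1, 11, 4), (2, 10, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
conn.executemany("INSERT INTO SalesOrderDetail VALUES (?, ?, ?)", rows)

# GROUP BY collapses each order into a single row.
grouped = conn.execute("""
    SELECT SalesOrderID, SUM(OrderQty) AS Total
    FROM SalesOrderDetail
    GROUP BY SalesOrderID
    ORDER BY SalesOrderID
""").fetchall()

# OVER (PARTITION BY ...) keeps every detail row and repeats the
# per-order total next to it.
windowed = conn.execute("""
    SELECT SalesOrderID, ProductID, OrderQty,
           SUM(OrderQty) OVER (PARTITION BY SalesOrderID) AS Total
    FROM SalesOrderDetail
    ORDER BY SalesOrderID, ProductID
""").fetchall()

print(len(grouped), "rows with GROUP BY")
print(len(windowed), "rows with OVER")
```

With GROUP BY there is no way to also select ProductID or OrderQty unless you group by them too, which changes what the total means.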

Here's a practical example of why windowed aggregates are great. Suppose you need to calculate what percentage of its group's total each value represents. Without windowed aggregates you'd have to first derive a list of aggregated values and then join it back to the original rowset, like this:

    SELECT
      orig.[Partition],
      orig.Value,
      orig.Value * 100.0 / agg.TotalValue AS ValuePercent
    FROM OriginalRowset orig
      INNER JOIN (
        SELECT
          [Partition],
          SUM(Value) AS TotalValue
        FROM OriginalRowset
        GROUP BY [Partition]
      ) agg ON orig.[Partition] = agg.[Partition]

Now look how you can do the same with a windowed aggregate:

    SELECT
      [Partition],
      Value,
      Value * 100.0 / SUM(Value) OVER (PARTITION BY [Partition]) AS ValuePercent
    FROM OriginalRowset orig

Much easier and cleaner, isn't it?
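If you want to convince yourself that the two forms are equivalent, the following sketch runs both against the same data and compares the results. It assumes SQLite (via Python's sqlite3, version 3.25 or later) rather than SQL Server, so the column is quoted as "Partition" instead of [Partition], and the four sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE OriginalRowset ("Partition" TEXT, Value INT)')
# Hypothetical data: two partitions with totals 40 and 20.
conn.executemany(
    "INSERT INTO OriginalRowset VALUES (?, ?)",
    [("A", 10), ("A", 30), ("B", 5), ("B", 15)],
)

# Derived table with the totals, joined back to the detail rows.
join_sql = """
    SELECT orig."Partition", orig.Value,
           orig.Value * 100.0 / agg.TotalValue AS ValuePercent
    FROM OriginalRowset orig
    INNER JOIN (
        SELECT "Partition", SUM(Value) AS TotalValue
        FROM OriginalRowset
        GROUP BY "Partition"
    ) agg ON orig."Partition" = agg."Partition"
    ORDER BY 1, 2
"""

# Same calculation with a windowed SUM, no join needed.
window_sql = """
    SELECT "Partition", Value,
           Value * 100.0 / SUM(Value) OVER (PARTITION BY "Partition") AS ValuePercent
    FROM OriginalRowset
    ORDER BY 1, 2
"""

join_rows = conn.execute(join_sql).fetchall()
window_rows = conn.execute(window_sql).fetchall()

print(join_rows == window_rows)
for row in window_rows:
    print(row)
```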

Andriy M answered Oct 08 '22 13:10


The OVER clause is powerful in that you can compute aggregates over different ranges ("windowing"), whether you use a GROUP BY or not.

Example: get the count per group and the count of all grouped rows (the window is applied after the GROUP BY)

    SELECT
        SalesOrderID, ProductID, OrderQty
        ,COUNT(OrderQty) AS 'Count'
        ,COUNT(*) OVER () AS 'CountAll'
    FROM Sales.SalesOrderDetail
    WHERE
        SalesOrderID IN(43659,43664)
    GROUP BY
        SalesOrderID, ProductID, OrderQty

Get different COUNTs, no GROUP BY

    SELECT
        SalesOrderID, ProductID, OrderQty
        ,COUNT(OrderQty) OVER(PARTITION BY SalesOrderID) AS 'CountQtyPerOrder'
        ,COUNT(OrderQty) OVER(PARTITION BY ProductID) AS 'CountQtyPerProduct'
        ,COUNT(*) OVER () AS 'CountAllAgain'
    FROM Sales.SalesOrderDetail
    WHERE
        SalesOrderID IN(43659,43664)
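For anyone who wants to try both variants without an AdventureWorks install, here is a rough sketch using Python's sqlite3 (an assumption: SQLite 3.25 or later, not SQL Server) with four invented rows, one of them a duplicate so that the grouped and ungrouped counts differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SalesOrderDetail (SalesOrderID INT, ProductID INT, OrderQty INT)"
)
# Hypothetical rows; the first two are identical on purpose.
conn.executemany(
    "INSERT INTO SalesOrderDetail VALUES (?, ?, ?)",
    [(1, 10, 2), (1, 10, 2), (1, 11, 4), (2, 10, 6)],
)

# With GROUP BY: COUNT(OrderQty) counts the rows inside each group, while
# COUNT(*) OVER () runs after grouping and counts the grouped result rows.
with_group_by = conn.execute("""
    SELECT SalesOrderID, ProductID, OrderQty,
           COUNT(OrderQty) AS Cnt,
           COUNT(*) OVER () AS CountAll
    FROM SalesOrderDetail
    GROUP BY SalesOrderID, ProductID, OrderQty
    ORDER BY SalesOrderID, ProductID
""").fetchall()

# Without GROUP BY: each window function uses its own partition,
# and every detail row is kept.
no_group_by = conn.execute("""
    SELECT SalesOrderID, ProductID,
           COUNT(OrderQty) OVER (PARTITION BY SalesOrderID) AS CountQtyPerOrder,
           COUNT(OrderQty) OVER (PARTITION BY ProductID) AS CountQtyPerProduct,
           COUNT(*) OVER () AS CountAllAgain
    FROM SalesOrderDetail
    ORDER BY SalesOrderID, ProductID
""").fetchall()

for row in with_group_by:
    print(row)
for row in no_group_by:
    print(row)
```

Note that in the grouped query the OVER () count is 3 (the number of groups), whereas in the ungrouped query it is 4 (the number of detail rows).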
gbn answered Oct 08 '22 14:10