I am just starting on Teradata and I have come across an Ordered Analytical Function called "Rows unbounded preceding" in Teradata. I tried several sites to learn about the function but all of them uses a complicated example explaining the same. Could you please provide me with a naive example so that I can get the basics clear?
The frame, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, means that the window consists of the first row of the partition and all the rows up to the current row. Each calculation is done over a different set of rows. For example, when performing the calculation for row 4, the rows 1 to 4 are used.
UNBOUNDED PRECEDING – All rows before the current row. n PRECEDING – n rows before the current row. CURRENT ROW – Just the current row. n FOLLOWING – n rows after the current row. UNBOUNDED FOLLOWING – All rows after the current row.
ROWS PRECEDING and ROWS FOLLOWINGPreceding refers to rows before the current row, where the following argument refers to rows after the current row. We can specify a fixed value for preceding or following, or as we will see later on, we can determine the limits of the partition with the UNBOUNDED.
The Cumulative Sum (CSUM) function provides a running or cumulative total for a column's numeric value. This allows users to see what is happening with column totals over an ongoing progression.
It's the "frame" or "range" clause of window functions, which are part of the SQL standard and implemented in many databases, including Teradata.
A simple example would be to calculate the average amount in a frame of three days. I'm using PostgreSQL syntax for the example, but it will be the same for Teradata:
WITH data (t, a) AS ( VALUES(1, 1), (2, 5), (3, 3), (4, 5), (5, 4), (6, 11) ) SELECT t, a, avg(a) OVER (ORDER BY t ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) FROM data ORDER BY t
... which yields:
t a avg ---------- 1 1 3.00 2 5 3.00 3 3 4.33 4 5 4.00 5 4 6.67 6 11 7.50
As you can see, each average is calculated "over" an ordered frame consisting of the range between the previous row (1 preceding
) and the subsequent row (1 following
).
When you write ROWS UNBOUNDED PRECEDING
, then the frame's lower bound is simply infinite. This is useful when calculating sums (i.e. "running totals"), for instance:
WITH data (t, a) AS ( VALUES(1, 1), (2, 5), (3, 3), (4, 5), (5, 4), (6, 11) ) SELECT t, a, sum(a) OVER (ORDER BY t ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) FROM data ORDER BY t
yielding...
t a sum --------- 1 1 1 2 5 6 3 3 9 4 5 14 5 4 18 6 11 29
Here's another very good explanations of SQL window functions.
ROWS UNBOUNDED PRECEDING
is no Teradata-specific syntax, it's Standard SQL. Together with the ORDER BY
it defines the window on which the result is calculated.
Logically a Windowed Aggregate Function is newly calculated for each row within the PARTITION based on all ROWS between a starting row and an ending row.
Starting and ending rows might be fixed or relative to the current row based on the following keywords:
Possible kinds of calculation include:
So SUM(x) OVER (ORDER BY col ROWS UNBOUNDED PRECEDING)
results in a Cumulative Sum or Running Total
11 -> 11 2 -> 11 + 2 = 13 3 -> 13 + 3 (or 11+2+3) = 16 44 -> 16 + 44 (or 11+2+3+44) = 60
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With