Running the following code:
val sales = Seq(
(0, 0, 0, 5),
(1, 0, 1, 3),
(2, 0, 2, 1),
(3, 1, 0, 2),
(4, 2, 0, 8),
(5, 2, 2, 8))
.toDF("id", "orderID", "prodID", "orderQty")
val orderedByID = Window.orderBy('id')
val totalQty = sum('orderQty').over(orderedByID).as('running_total')
val salesTotalQty = sales.select(*, totalQty).orderBy('id')
salesTotalQty.show()
The result is:
+---+-------+------+--------+-------------+
| id|orderID|prodID|orderQty|running_total|
+---+-------+------+--------+-------------+
| 0| 0| 0| 5| 5|
| 1| 0| 1| 3| 8|
| 2| 0| 2| 1| 9|
| 3| 1| 0| 2| 11|
| 4| 2| 0| 8| 19|
| 5| 2| 2| 8| 27|
+---+-------+------+--------+-------------+
There is no window frame defined in the above code, it looks the default window frame is rowsBetween(Window.unboundedPreceding, Window.currentRow)
Not sure my understanding about default window frame is correct
Overall, the main function of a frame is to hold the panes of the window in place—with the aid of spacers—and hold the window into the wall of a building. Taking into account the frame as well as the glazing when choosing a window is vitally important in ensuring that the energy efficiency of the window is suitable.
Using PARTITION BY to Define a Window Frame The window frame is a set of rows related to the current row where the window function is used for calculation. The window frame can be a different set of rows for the next row in the query result, since it depends on the current row being processed.
In SQL, a window function or analytic function is a function which uses values from one or multiple rows to return a value for each row. (This contrasts with an aggregate function, which returns a single value for multiple rows.)
Window functions in Snowflake are a way to compute values over a group of rows. They return a single value for each row, in contrast to aggregate functions which return a single value for a group of rows.
From Spark Gotchas
Default frame specification depends on other aspects of a given window defintion:
- if the ORDER BY clause is specified and the function accepts the frame specification, then the frame specification is defined by RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW,
- otherwise the frame specification is defined by ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With