How to use a ring data structure in window functions

Tags:

I have data that is arranged in a ring structure (or circular buffer), that is it can be expressed as sequences that cycle: ...-1-2-3-4-5-1-2-3-.... See this picture to get an idea of a 5-part ring:

enter image description here

I'd like to create a window query that can combine the lag and lead items into a three point array, but I can't figure it out. For example at part 1 of a 5-part ring, the lag/lead sequence is 5-1-2, or at part 4 is 3-4-5.

Here is an example table of two rings with different numbers of parts (always more than three per ring):

create table rp (ring int, part int);
insert into rp(ring, part) values(1, generate_series(1, 5));
insert into rp(ring, part) values(2, generate_series(1, 7));

Here is a nearly successful query:

SELECT ring, part, array[
    lag(part, 1, NULL) over (partition by ring),
    part,
    lead(part, 1, 1) over (partition by ring)
    ] AS neighbours
FROM rp;

 ring | part | neighbours
------+------+------------
    1 |    1 | {NULL,1,2}
    1 |    2 | {1,2,3}
    1 |    3 | {2,3,4}
    1 |    4 | {3,4,5}
    1 |    5 | {4,5,1}
    2 |    1 | {NULL,1,2}
    2 |    2 | {1,2,3}
    2 |    3 | {2,3,4}
    2 |    4 | {3,4,5}
    2 |    5 | {4,5,6}
    2 |    6 | {5,6,7}
    2 |    7 | {6,7,1}
(12 rows)

The only thing I need to do is to replace the NULL with the ending point of each ring, which is the last value. Now, along with lag and lead window functions, there is a last_value function which would be ideal. However, these cannot be nested:

SELECT ring, part, array[
    lag(part, 1, last_value(part) over (partition by ring)) over (partition by ring),
    part,
    lead(part, 1, 1) over (partition by ring)
    ] AS neighbours
FROM rp;
ERROR:  window function calls cannot be nested
LINE 2:     lag(part, 1, last_value(part) over (partition by ring)) ...

Update. Thanks to @Justin's suggestion to use coalesce to avoid nesting window functions. Furthermore, it has been pointed out by numerous folks that first/last values need an explicit order by on the ring sequence, which happens to be part for this example. So randomising the input data a bit:

create table rp (ring int, part int);
insert into rp(ring, part) select 1, generate_series(1, 5) order by random();
insert into rp(ring, part) select 2, generate_series(1, 7) order by random();

834

asked Jul 30 '14 06:07

Mike T

2 Answers

Use COALESCE like @Justin provided.
With first_value() / last_value() you need to add an ORDER BY clause to the window definition or the order is undefined. You just got lucky in the example, because the rows happen to be in order right after creating the dummy table.
Once you add ORDER BY, the default window frame ends at the current row, and you need to special case the last_value() call - or revert the sort order in the window frame like demonstrated in my first example.
When reusing a window definition multiple times, an explicit WINDOW clause simplifies syntax a lot:

SELECT ring, part, ARRAY[
          coalesce(
             lag(part) OVER w
            ,first_value(part) OVER (PARTITION BY ring ORDER BY part DESC))
         ,part
         ,coalesce(
             lead(part) OVER w
            ,first_value(part) OVER w)
         ] AS neighbours
FROM   rp
WINDOW w AS (PARTITION BY ring ORDER BY part);

Better yet, reuse the same window definition, so Postgres can calculate all values in a single scan. For this to work we need to define a custom window frame:

SELECT ring, part, ARRAY[
          coalesce(
             lag(part) OVER w
            ,last_value(part) OVER w)
         ,part
         ,coalesce(
             lead(part) OVER w
            ,first_value(part) OVER w)
         ] AS neighbours
FROM   rp
WINDOW w AS (PARTITION BY ring
             ORDER BY part
             RANGE BETWEEN UNBOUNDED PRECEDING
                       AND UNBOUNDED FOLLOWING)
ORDER  BY 1,2;

You can even adapt the frame definition for each window function call:

SELECT ring, part, ARRAY[
          coalesce(
             lag(part) OVER w
            ,last_value(part) OVER (w RANGE BETWEEN CURRENT ROW
                                                AND UNBOUNDED FOLLOWING))
         ,part
         ,coalesce(
             lead(part) OVER w
            ,first_value(part) OVER w)
         ] AS neighbours
FROM   rp
WINDOW w AS (PARTITION BY ring ORDER BY part)
ORDER  BY 1,2;

Might be faster for rings with many parts. You'll have to test.

SQL Fiddle demonstrating all three with an improved test case. Consider query plans.

More about window frame definitions:

In the manual.
PostgreSQL window function: partition by comparison
PostgreSQL query with max and min date plus associated id per row

150

answered Oct 09 '22 01:10

Erwin Brandstetter

Query:

SQLFIDDLEExample

SELECT ring, part, array[
    coalesce(lag(part, 1, NULL) over (partition by ring), 
             max(part) over (partition by ring)),
    part,
    lead(part, 1, 1) over (partition by ring)
    ] AS neighbours
FROM rp;

Result:

| RING | PART | NEIGHBOURS |
|------|------|------------|
|    1 |    1 |      5,1,2 |
|    1 |    2 |      1,2,3 |
|    1 |    3 |      2,3,4 |
|    1 |    4 |      3,4,5 |
|    1 |    5 |      4,5,1 |
|    2 |    1 |      7,1,2 |
|    2 |    2 |      1,2,3 |
|    2 |    3 |      2,3,4 |
|    2 |    4 |      3,4,5 |
|    2 |    5 |      4,5,6 |
|    2 |    6 |      5,6,7 |
|    2 |    7 |      6,7,1 |

answered Oct 09 '22 02:10

Justin

Related questions
                            
                                Dynamic Oracle Pivot_In_Clause
                            
                                How do I create an index inside a stored procedure?
                            
                                Get Max Count from Group by
                            
                                How to get sum of time field in SQL server 2008
                            
                                Rails: NameError: uninitialized constant on join table
                            
                                Mysql GROUP BY and ORDER BY DESC [duplicate]
                            
                                SQL vs NoSQL for data that will be presented to a user after multiple filters have been added
                            
                                SQL query dynamic table name in FOR
                            
                                Condition on joined table faster than condition on reference
                            
                                Mysql integer default value 0
                            
                                SQL Conditional Case
                            
                                Leading Comma or Trailing Comma? [closed]
                            
                                SQL Query with group by for multiple date ranges
                            
                                PostgreSQL SSPI authentication - FATAL: 2801: password authentication failed for user "xxx"
                            
                                How to get ResultSet from executeBatch?
                            
                                How to implement intersect using where exists or not exists in mysql
                            
                                Dynamic Pivot Results to a Temp Table
                            
                                Rails Join Model select joint columns
                            
                                Slick 2 - Update columns in a table and return whole table object
                            
                                Why are dot-separated prefixes ignored in the column list for INSERT statements?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

How to use a ring data structure in window functions

Tags:

sql

postgresql

window-functions

circular-buffer

circular-list

Mike T

People also ask

2 Answers

Erwin Brandstetter

Justin

Recent Activity

Donate For Us