Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Using postgres window functions with different order by clause

Tags:

postgresql

I have an issue using multiple order by in postgres window functions. Here is short example. Select total number of rows, N first and N last rows from in a single query. (It is not the task that I want to achieve just an example of the issue) Is it expected behavior or a bug in postgres? I'm using postgres 9.6

select generate_series(1, 10) id
into q;

select
       count(*) over (),
       lag(id , 0) over (order by id asc) a,
       lag(id , 0) over (order by id desc) d
from q
limit 5;

Output:

10,10,10
10,9,9
10,8,8
10,7,7
10,6,6

Expected:

10,1,10
10,2,9
10,3,8
10,4,7
10,5,6

Code work good if only N first or only N last rows are selected.

like image 795
ijrandom Avatar asked Feb 17 '26 00:02

ijrandom


1 Answers

I once had a similar problem which had the same explanation: https://stackoverflow.com/a/48668220/3984221


Explanation of the behaviour:

demo:db<>fiddle

You can explain this when you have a look into the EXPLAIN output:

> | QUERY PLAN                                                                                                                   |
> | :--------------------------------------------------------------------------------------------------------------------------- |
> | WindowAgg  (cost=368.69..445.19 rows=2550 width=20) (actual time=0.146..0.150 rows=10 loops=1)                               |
> |   -&gt;  WindowAgg  (cost=368.69..413.32 rows=2550 width=12) (actual time=0.128..0.136 rows=10 loops=1)                         |
> |         -&gt;  Sort  (cost=368.69..375.07 rows=2550 width=8) (actual time=0.126..0.128 rows=10 loops=1)                         |
> |               Sort Key: id                                                                                                   |
> |               Sort Method: quicksort  Memory: 25kB                                                                           |
> |               -&gt;  WindowAgg  (cost=179.78..224.41 rows=2550 width=8) (actual time=0.048..0.056 rows=10 loops=1)              |
> |                     -&gt;  Sort  (cost=179.78..186.16 rows=2550 width=4) (actual time=0.033..0.034 rows=10 loops=1)             |
> |                           Sort Key: id DESC                                                                                  |
> |                           Sort Method: quicksort  Memory: 25kB                                                               |
> |                           -&gt;  Seq Scan on q  (cost=0.00..35.50 rows=2550 width=4) (actual time=0.013..0.014 rows=10 loops=1) |
> | Planning Time: 0.292 ms                                                                                                      |
> | Execution Time: 0.445 ms                                                                                                     |

Here you can see: First there is a SORT Key: id DESC. So everything is ordered in DESC order. If you have only the DESC ordered function, this would be your result, as you already saw. Now, you have a second window function. So, the entire result will be sorted a second time, into the ASC order, incl. your first result. So, your first lag() result 10, 9, 8, 7, 6, ... will be ordered back into 1, 2, 3, 4, 5, ... Afterwards the second lag() result will be added.

However, your specific result for your lag() function is explainable, of course: You don't shift your data, so you get the current value. You can cross check this (as I did in the fiddle above), when you turn your 0 shift value into 1. Then your DESC lag() will return 2 for id 1, but ASC gives NULL. Everything's fine.

So, to create your expected output, you need another approach, e.g. using row_number() to add the row count in ASC and DESC order and filter them afterwards:

demo:db<>fiddle

SELECT
    COUNT(*) OVER (),
    a.id,
    d.id
FROM ( 
   select
       id,
       row_number() over (order by id asc)
   from q
) a
JOIN ( 
   select
       id,
       row_number() over (order by id desc)
   from q
) d ON a.row_number = d.row_number
LIMIT 5
like image 57
S-Man Avatar answered Feb 18 '26 15:02

S-Man