
SQL window function with a where clause?

I'm trying to correlate two types of events for users. I want to see every event "B" along with the most recent event "A" for that user prior to that "B" event. How would one accomplish this? In particular, I'm trying to do this in Postgres.

I was hoping it was possible to use a "where" clause in a window function, in which case I could essentially do a LAG() with a "where event='A'", but that doesn't seem to be possible.

Any recommendations?

Data example:

|user |time|event|
|-----|----|-----|
|Alice|1   |A    |
|Bob  |2   |A    |
|Alice|3   |A    |
|Alice|4   |B    |
|Bob  |5   |B    |
|Alice|6   |B    |

Desired result:

|user |event_b_time|last_event_a_time|
|-----|------------|-----------------|
|Alice|4           |3                |
|Bob  |5           |2                |
|Alice|6           |3                |
MJ. asked Sep 07 '16 20:09





1 Answer

Just tried Gordon's approach using PostgreSQL 9.5.4, and it complained that

FILTER is not implemented for non-aggregate window functions

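which means using lag() with FILTER is not allowed. So I modified Gordon's query to use max(), a different window frame, and a CTE. Because max() is an aggregate, it accepts FILTER, and since each partition is ordered by time, the largest "A" time among the preceding rows is also the most recent one: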

    WITH subq AS (
      SELECT
        "user", event, time AS event_b_time,
        max(time) FILTER (WHERE event = 'A') OVER (
          PARTITION BY "user"
          ORDER BY time
          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS last_event_a_time
      FROM events
      ORDER BY time
    )
    SELECT
      "user", event_b_time, last_event_a_time
    FROM subq
    WHERE event = 'B';
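To make the frame logic concrete, here is a minimal Python sketch of what the window computes for each row: the largest "A" time seen so far for the same user, excluding the current row. It only illustrates the semantics and says nothing about how Postgres executes the query. The function name `last_a_before_each_b` and the in-memory list (the question's sample data) are made up for illustration.

```python
# Sample rows from the question: (user, time, event).
events = [
    ("Alice", 1, "A"),
    ("Bob", 2, "A"),
    ("Alice", 3, "A"),
    ("Alice", 4, "B"),
    ("Bob", 5, "B"),
    ("Alice", 6, "B"),
]


def last_a_before_each_b(rows):
    """Pair every 'B' row with the largest earlier 'A' time for the same user."""
    latest_a = {}  # user -> max 'A' time among rows already visited
    result = []
    # Sorting mirrors ORDER BY time; the dict key mirrors PARTITION BY "user".
    for user, time, event in sorted(rows, key=lambda r: r[1]):
        if event == "B":
            # Only earlier rows have been folded in, like a frame that
            # ends at 1 PRECEDING. None plays the role of SQL NULL.
            result.append((user, time, latest_a.get(user)))
        elif event == "A":
            latest_a[user] = max(time, latest_a.get(user, time))
    return result


for row in last_a_before_each_b(events):
    print(row)
```

A user who has a "B" with no earlier "A" gets None in this sketch, which corresponds to the NULL the SQL query returns in that case.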

Verified that this works with PostgreSQL 9.5.4.
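For anyone who wants to re-run the check without a Postgres server, the sketch below runs the same query against the question's sample data using Python's built-in sqlite3 module. Treat it as a stand-in with assumptions: it does not exercise Postgres itself, it assumes the bundled SQLite is recent enough to support window functions with FILTER (window functions arrived in SQLite 3.25), and it replaces the ORDER BY inside the CTE with an outer ORDER BY purely to make the row order deterministic.

```python
import sqlite3

# In-memory SQLite database used as a stand-in; this is NOT Postgres.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE events ("user" TEXT, time INTEGER, event TEXT)')
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        ("Alice", 1, "A"),
        ("Bob", 2, "A"),
        ("Alice", 3, "A"),
        ("Alice", 4, "B"),
        ("Bob", 5, "B"),
        ("Alice", 6, "B"),
    ],
)

query = """
WITH subq AS (
  SELECT
    "user", event, time AS event_b_time,
    max(time) FILTER (WHERE event = 'A') OVER (
      PARTITION BY "user"
      ORDER BY time
      ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ) AS last_event_a_time
  FROM events
)
SELECT "user", event_b_time, last_event_a_time
FROM subq
WHERE event = 'B'
ORDER BY event_b_time  -- added here so the output order is deterministic
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The printed rows can be compared by eye with the "Desired result" table in the question.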

Thanks to Gordon for the FILTER trick!

Cheng Lian answered Sep 19 '22 12:09
