Selecting latest consecutive records that match a condition with PostgreSQL

Tags: sql, postgresql

I am looking for a PostgreSQL query to find the latest consecutive records that match a condition. Let me explain it better with an example:

|  ID  |  HEATING STATE  |  DATE       |
| ---- | --------------- |  ---------- |
|  1   |  ON             |  2018-02-19 |
|  2   |  ON             |  2018-02-20 |
|  3   |  OFF            |  2018-02-20 |
|  4   |  OFF            |  2018-02-21 |
|  5   |  ON             |  2018-02-21 |
|  6   |  OFF            |  2018-02-21 |
|  7   |  ON             |  2018-02-22 |
|  8   |  ON             |  2018-02-22 |
|  9   |  ON             |  2018-02-22 |
| 10   |  ON             |  2018-02-23 |

I need to find all the recent consecutive records with date >= 2018-02-20 and heating_state ON, i.e. the ones with ID 7, 8, 9, 10. My main issue is with the fact that they must be consecutive.

For further clarification, if needed:

  • ID 1 is excluded because it is older than 2018-02-20
  • ID 2 is excluded because it is followed by ID 3, which has heating state OFF
  • ID 3 is excluded because it has heating state OFF
  • ID 4 is excluded because it has heating state OFF
  • ID 5 is excluded because it is followed by ID 6, which has heating state OFF
  • ID 6 is excluded because it has heating state OFF

spaghetticode asked May 16 '18 15:05



1 Answer

I think this is best solved using window functions and a filtered aggregate.

For each row, count the rows from the latest one down to that row that have state = 'OFF', then keep only the rows where that count is 0.

You need a subquery because you cannot use a window function result in the WHERE condition (WHERE is evaluated before window functions).

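SELECT id, state, date
FROM (SELECT id, state, date,
             -- count 'OFF' rows from the newest row down to this one;
             -- order by id because date has ties within a day
             count(*) FILTER (WHERE state = 'OFF')
                OVER (ORDER BY id DESC) AS later_off_count
      FROM tab) q
WHERE later_off_count = 0
  AND date >= '2018-02-20'
ORDER BY id DESC;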

 id | state |    date    
----+-------+------------
 10 | ON    | 2018-02-23
  9 | ON    | 2018-02-22
  8 | ON    | 2018-02-22
  7 | ON    | 2018-02-22
(4 rows)
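
To try this out, the sample data from the question can be loaded roughly as follows. This is only a sketch: the table name tab and the column names id, state and date are assumptions taken from the query above (the question calls the column heating_state), and the window is ordered by id DESC on the assumption that id reflects the order in which the records were written. The second statement shows the per-row count that the outer WHERE condition relies on.

```sql
-- Hypothetical setup: table and column names follow the query above.
CREATE TABLE tab (
    id    integer PRIMARY KEY,
    state text    NOT NULL,
    date  date    NOT NULL
);

-- Sample rows copied from the question.
INSERT INTO tab (id, state, date) VALUES
    (1,  'ON',  '2018-02-19'),
    (2,  'ON',  '2018-02-20'),
    (3,  'OFF', '2018-02-20'),
    (4,  'OFF', '2018-02-21'),
    (5,  'ON',  '2018-02-21'),
    (6,  'OFF', '2018-02-21'),
    (7,  'ON',  '2018-02-22'),
    (8,  'ON',  '2018-02-22'),
    (9,  'ON',  '2018-02-22'),
    (10, 'ON',  '2018-02-23');

-- Inner query on its own: for each row, the number of 'OFF' rows
-- between the newest row and that row (current row included).
SELECT id, state, date,
       count(*) FILTER (WHERE state = 'OFF')
          OVER (ORDER BY id DESC) AS later_off_count
FROM tab
ORDER BY id DESC;
```

With this data, the rows with id 10 down to 7 get a count of 0. The row with id 6 is the first 'OFF' row seen from the top, so from there on the count is at least 1 and those rows are dropped by later_off_count = 0.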

Laurenz Albe answered Oct 22 '22 07:10