PostgreSQL splitting time range into days

Tags: sql, postgresql, date-range, generate-series

I'm trying to write a complex query using PostgreSQL 9.2.4, and I'm having trouble getting it working. I have a table which contains a time range, as well as several other columns. When I store data in this table, if all of the columns are the same and the time ranges overlap or are adjacent, I combine them into one row.

When I retrieve them, though, I want to split the ranges at day boundaries - so for example:

2013-01-01 00:00:00 to 2013-01-02 23:59:59

would be selected as two rows:

2013-01-01 00:00:00 to 2013-01-01 23:59:59
2013-01-02 00:00:00 to 2013-01-02 23:59:59

with the values in the other columns the same for both retrieved entries.

I have seen this question which seems to more or less address what I want, but it's for a "very old" version of PostgreSQL, so I'm not sure it's really still applicable.

I've also seen this question, which does exactly what I want, but as far as I know the CONNECT BY statement is an Oracle extension to the SQL standard, so I can't use it.

I believe I can achieve this using PostgreSQL's generate_series, but I'm hoping there's a simple example out there demonstrating how it can be used to do this.

This is the query I'm working on at the moment, which currently doesn't work (because I can't reference the FROM table in a joined subquery), but I believe this is more-or-less the right track.

Here's the fiddle with the schema, sample data, and my working query.

Update: I just found out a fun fact, thanks to this question, that if you use a set-returning function in the SELECT part of the query, PostgreSQL will "automagically" do a cross join on the set and the row. I think I'm close to getting this working.
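As a minimal illustration of that behavior (the two-row VALUES list and the names v, x and n are made up for this sketch and are not part of the real schema), each input row is repeated once per element of the set:

```sql
-- Hypothetical two-row input; generate_series in the SELECT list
-- expands every input row into one output row per series element.
SELECT x, generate_series(1, 2) AS n
FROM  (VALUES ('a'), ('b')) v(x);
```

This returns four rows: each of 'a' and 'b' paired with 1 and with 2.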

asked Oct 18 '13 16:10

CmdrMoozy

2 Answers

First off, your upper border concept is broken. A timestamp with 23:59:59 is no good. The data type timestamp has fractional digits. What about '2013-10-18 23:59:59.123'::timestamp?

Include the lower border and exclude the upper border everywhere in your logic. Compare:

Calculate number of concurrent events in SQL
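A small sketch of why this matters, using a made-up probe value (the names v and ts are only for illustration): a timestamp in the last second of the day falls outside a closed range ending at 23:59:59, but inside a half-open range that excludes the next midnight.

```sql
-- Hypothetical probe timestamp with fractional seconds.
SELECT ts
     , ts BETWEEN timestamp '2013-10-18 00:00:00'
              AND timestamp '2013-10-18 23:59:59' AS in_closed_range     -- false
     , ts >= timestamp '2013-10-18'
       AND ts < timestamp '2013-10-19'            AS in_half_open_range  -- true
FROM  (VALUES (timestamp '2013-10-18 23:59:59.123')) v(ts);
```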

Building on this premise:

Postgres 9.2 or older

SELECT id
     , stime
     , etime
FROM   timesheet_entries t
WHERE  etime <= stime::date + 1  -- this includes upper border 00:00

UNION ALL
SELECT id
     , CASE WHEN stime::date = d THEN stime ELSE d END     -- AS stime
     , CASE WHEN etime::date = d THEN etime ELSE d + 1 END -- AS etime
FROM (
   SELECT id
        , stime
        , etime
        , generate_series(stime::date, etime::date, interval '1d')::date AS d
   FROM   timesheet_entries t
   WHERE  etime > stime::date + 1
   ) sub
ORDER  BY id, stime;

Or simply:

SELECT id
     , CASE WHEN stime::date = d THEN stime ELSE d END     -- AS stime
     , CASE WHEN etime::date = d THEN etime ELSE d + 1 END -- AS etime
FROM (
   SELECT id
        , stime
        , etime
        , generate_series(stime::date, etime::date, interval '1d')::date AS d
   FROM   timesheet_entries t
   ) sub
ORDER  BY id, stime;

The simpler one may even be faster.
Note a corner case difference when stime and etime both fall on 00:00 exactly. Then a row with a zero time range is added at the end. There are various ways to deal with that. I propose:

SELECT *
FROM  (
   SELECT id
        , CASE WHEN stime::date = d THEN stime ELSE d END     AS stime
        , CASE WHEN etime::date = d THEN etime ELSE d + 1 END AS etime
   FROM (
      SELECT id
           , stime
           , etime
           , generate_series(stime::date, etime::date, interval '1d')::date AS d
      FROM   timesheet_entries t
      ) sub1
   ORDER  BY id, stime
   ) sub2
WHERE  etime <> stime;
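To see the corner case in isolation, here is a sketch that runs the simpler query without the filter against a single made-up row supplied through VALUES (the id and the timestamps are assumptions for illustration, not data from the fiddle):

```sql
-- Hypothetical entry running exactly from midnight to midnight.
SELECT id
     , CASE WHEN stime::date = d THEN stime ELSE d END     AS stime
     , CASE WHEN etime::date = d THEN etime ELSE d + 1 END AS etime
FROM (
   SELECT id
        , stime
        , etime
        , generate_series(stime::date, etime::date, interval '1d')::date AS d
   FROM  (VALUES (1, timestamp '2013-10-17 00:00', timestamp '2013-10-18 00:00')
         ) t(id, stime, etime)
   ) sub
ORDER  BY id, stime;
```

The series yields two days, so two rows come back. The second row starts and ends at 2013-10-18 00:00:00, which is the zero-length row that the WHERE etime <> stime filter removes.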

Postgres 9.3+

In Postgres 9.3+ it is better to use LATERAL for this:

SELECT id
     , CASE WHEN stime::date = d THEN stime ELSE d END     AS stime
     , CASE WHEN etime::date = d THEN etime ELSE d + 1 END AS etime
FROM   timesheet_entries t
     , LATERAL (SELECT d::date
                FROM   generate_series(t.stime::date, t.etime::date, interval '1d') d
                ) d
ORDER  BY id, stime;

Details in the manual.
Same corner case as above.
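One possible way to handle it here, sketched by wrapping the LATERAL query in the same kind of filter as the 9.2 variant (the alias sub is arbitrary, and this is an untested adaptation rather than a query from the fiddle):

```sql
-- Requires Postgres 9.3+ because of LATERAL.
SELECT *
FROM  (
   SELECT id
        , CASE WHEN stime::date = d THEN stime ELSE d END     AS stime
        , CASE WHEN etime::date = d THEN etime ELSE d + 1 END AS etime
   FROM   timesheet_entries t
        , LATERAL (SELECT d::date
                   FROM   generate_series(t.stime::date, t.etime::date, interval '1d') d
                   ) d
   ) sub
WHERE  etime <> stime   -- drop zero-length rows produced when etime is exactly 00:00
ORDER  BY id, stime;
```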

SQL Fiddle demonstrating all.

answered Oct 12 '22 20:10

Erwin Brandstetter

There is a simple solution (if the intervals start at the same time):

postgres=# select i, i + interval '1day' - interval '1sec' 
  from generate_series('2013-01-01 00:00:00'::timestamp, '2013-01-02 23:59:59', '1day') g(i);
          i          │      ?column?       
─────────────────────┼─────────────────────
 2013-01-01 00:00:00 │ 2013-01-01 23:59:59
 2013-01-02 00:00:00 │ 2013-01-02 23:59:59
(2 rows)

I wrote a table function that does it for any interval. It is fast: a two-year range is divided into 753 ranges in 10 ms.

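create or replace function day_ranges(timestamp, timestamp)
returns table(t1 timestamp, t2 timestamp) as $$
begin
  -- t1 and t2 are the output columns; each "return next" emits the current pair
  t1 := $1;
  if $2 > $1 then
    loop
      -- last day reached: close the range at the requested end and stop
      if t1::date = $2::date then
        t2 := $2;
        return next;
        exit;
      end if;
      -- otherwise end the range at 23:59:59 of the current day
      t2 := date_trunc('day', t1) + interval '1day' - interval '1sec';
      return next;
      -- continue at midnight of the following day
      t1 := t2 + interval '1sec';
    end loop;
  end if;
  return;
end;
$$ language plpgsql;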

Result:

postgres=# select * from day_ranges('2013-10-08 22:00:00', '2013-10-10 23:00:00');
         t1          │         t2          
─────────────────────┼─────────────────────
 2013-10-08 22:00:00 │ 2013-10-08 23:59:59
 2013-10-09 00:00:00 │ 2013-10-09 23:59:59
 2013-10-10 00:00:00 │ 2013-10-10 23:00:00
(3 rows)

Time: 6.794 ms
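To apply the function to stored rows, something along these lines should work. This is a sketch that assumes a table timesheet_entries with columns id, stime and etime, as used in the other answer's queries:

```sql
-- Postgres 9.2: call the set-returning function in the SELECT list.
-- Note that (f(...)).* may evaluate the function once per output column.
SELECT id, (day_ranges(stime, etime)).*
FROM   timesheet_entries
ORDER  BY id, t1;

-- Postgres 9.3+: the same idea with LATERAL, evaluating the function once per row.
SELECT t.id, r.t1, r.t2
FROM   timesheet_entries t
     , LATERAL day_ranges(t.stime, t.etime) r
ORDER  BY t.id, r.t1;
```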

And a faster (and a little bit longer) version based on RETURN QUERY:

create or replace function day_ranges(timestamp, timestamp)
returns table(t1 timestamp, t2 timestamp) as $$
begin
  t1 := $1; t2 := $2;
  if $1::date = $2::date then
    return next;
  else
    -- first day
    t2 := date_trunc('day', t1) + interval '1day' - interval '1sec';
    return next;
    if $2::date > $1::date + 1 then
      return query select d, d + interval '1day' - interval '1sec'
                      from generate_series(date_trunc('day', $1 + interval '1day')::timestamp,
                                           date_trunc('day', $2 - interval '1day')::timestamp,
                                           '1day') g(d);
    end if;
    -- last day 
    t1 := date_trunc('day', $2); t2 := $2;
    return next;
  end if;
  return;
end;
$$ language plpgsql;

answered Oct 12 '22 21:10

Pavel Stehule
