
Stretch time series efficiently in SQL

Tags: sql, postgresql

I would like to stretch a time series to a different length efficiently using SQL. Suppose I have the following data:

SQLFiddle (PostgreSQL)

-- drop table if exists time_series;

create table time_series (
  id serial,
  val numeric)
;

insert into time_series (val) values 
     (1), (2), (3), (4), (5), (6), 
     (5), (4), (3), (2), (1);

This time series has length 11, and I would like to stretch it to length 15 in such a way that the sum of the values in the stretched series equals the sum of the values in the original series. I have a solution, but it is not efficient:

select
  new_id,
  sum(new_val) as new_val
from
  (
    select 
      id, 
      val/15.0 as new_val,
      ceil(row_number() over(order by id, gs) / 11.0) as new_id
    from 
      time_series 
      cross join (select generate_series(1, 15) gs) gs 
  ) raw_data
group by
    new_id
order by
  new_id
;

This first creates an intermediate result with 15*11 rows and then collapses it back into 15 rows.
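To make the intermediate result concrete, here is a minimal sketch that lists the first 22 of the 165 pieces, that is, the pieces that end up in new_id 1 and 2. The column alias piece and the limit of 22 are only chosen for illustration.

```sql
-- Each original value is split into 15 equal pieces (val / 15.0).
-- Consecutive runs of 11 pieces are then assigned to one new_id.
select
  id,
  gs,
  val / 15.0 as piece,
  ceil(row_number() over (order by id, gs) / 11.0) as new_id
from
  time_series
  cross join generate_series(1, 15) as gs
order by
  id, gs
limit 22;  -- 2 * 11 pieces: all of new_id 1 and new_id 2
```

In this listing new_id 1 is built only from pieces of id 1, while new_id 2 mixes the last 4 pieces of id 1 with the first 7 pieces of id 2.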

While this works well for small time series, performance gets significantly worse with longer ones. If I want to stretch 2,000 rows into 3,000, then the query has to generate 6M rows first (this takes 30 seconds on my laptop).

Test data:

insert into time_series (val) select generate_series(1, 1000);
insert into time_series (val) select generate_series(1000, 1, -1);

Is there a more efficient solution in SQL that gives the same results?

Tomas Greif asked Feb 19 '26 12:02



1 Answer

Try this query, which avoids the cross join.

First, the ts1 subquery builds the intervals between consecutive values (the start value SVal and the end value EVal of each interval). It is then joined to a newly generated sequence. In the select list, each new ID is linearly interpolated within its joined interval to produce new_val.

The query also uses +1 and -1 to convert between the 1, 2, 3, ... sequence and the 0, 1, 2, ... sequence.
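The index arithmetic can be checked on its own. The following sketch evaluates it for a few sample positions, assuming the same constants as in the query (100 new points, 11 original points); the column aliases pos, interval_id and fraction are only illustrative names.

```sql
-- pos:         0-based position of the new point on the original axis
-- interval_id: 1-based ID of the interval (id -> id + 1) the point falls into
-- fraction:    how far into that interval the point lies (0 <= fraction < 1)
select
  gs,
  (gs - 1) / (100.0 / (11.0 - 1)) as pos,
  floor((gs - 1) / (100.0 / (11.0 - 1))) + 1 as interval_id,
  (gs - 1) / (100.0 / (11.0 - 1))
    - floor((gs - 1) / (100.0 / (11.0 - 1))) as fraction
from
  generate_series(1, 100) as gs
where
  gs in (1, 2, 11, 50, 100)
order by
  gs;
```

For gs = 1 the position is 0, so the point sits at the start of interval 1. For gs = 100 the position is 9.9, so the point lies 0.9 of the way into interval 10.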

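-- 100 is the length of the new series, 11 the length of the original one
select
  gs.gs as new_id,
  -- linear interpolation between the start and end value of the matched interval
  SVal + (EVal - SVal) * ((gs.gs - 1) / (100.0 / (11.0 - 1)) + 1 - ts1.ID) as new_val,
  SVal as StartInterval,
  EVal as EndInterval
from
  (select generate_series(1, 100) as gs) gs
  left join (
    -- one row per interval between consecutive points: ID -> ID + 1
    select T1.ID, T1.Val as SVal, T2.Val as EVal
    from time_series T1
    join time_series T2 on T1.ID = T2.ID - 1
  ) ts1
    on floor((gs.gs - 1) / (100.0 / (11.0 - 1))) + 1 = ts1.ID
order by
  gs.gs
;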
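Linear interpolation resamples the curve, but it does not by itself keep the total equal to the original sum. If the sum has to be preserved, as the question asks, one possible approach is to weight each original value by how much of it overlaps each new bucket, and to join only the rows that can overlap. The following is a sketch under assumptions, not a tuned solution: the CTE names params and src, the column pos, and the aliases g(k) and r(i) are made up for illustration, the target length 15 is hard-coded, and LATERAL requires PostgreSQL 9.3 or later.

```sql
with params as (
  select count(*)::int as n,  -- original length
         15 as m              -- target length (assumed; change as needed)
  from time_series
),
src as (
  -- renumber rows 1..n so that gaps in id do not matter
  select row_number() over (order by id) as pos, val
  from time_series
)
select
  g.k as new_id,
  -- On a common axis of n * m units, original row pos covers
  -- ((pos - 1) * m, pos * m] and new bucket k covers ((k - 1) * n, k * n].
  -- Each original value contributes in proportion to the overlap.
  sum(s.val
      * (least(g.k * p.n, s.pos * p.m) - greatest((g.k - 1) * p.n, (s.pos - 1) * p.m))
      / p.m::numeric) as new_val
from
  params p
  cross join generate_series(1, p.m) as g(k)
  -- only the original positions that can overlap bucket k
  -- (integer division: floor for the lower bound, ceiling for the upper bound)
  cross join lateral generate_series(
    (g.k - 1) * p.n / p.m + 1,
    (g.k * p.n + p.m - 1) / p.m
  ) as r(i)
  join src s on s.pos = r.i
group by
  g.k
order by
  g.k;
```

For the 11-row example this gives 11/15 for new_id 1 and 18/15 = 1.2 for new_id 2, which matches splitting every value into 15 pieces and regrouping them in runs of 11. The intermediate result grows roughly with n + m rows instead of n * m.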
valex answered Feb 22 '26 00:02



