Rolling joins (LOCF) in Postgres

Question

Over time, I have gotten used to the incredibly useful functionality of data.table's rolling joins in r. These utilize the operation of LOCF (last observation carried forward). Unfortunately, I am forced to work in a environment that I am much less familiar with (using postgres). Is there an analogous operation in SQL (specifically postgres)?

Here is an example of what I have and the output I would like:

Here is my first table

dt1 = data.table(Date=seq(from=as.Date("2013-01-03"),
         to=as.Date("2013-06-27"), by="1 day"),key="Date")[, ind:=.I]

           Date ind
  1: 2013-01-03   1
  2: 2013-01-04   2
  3: 2013-01-05   3
  4: 2013-01-06   4
  5: 2013-01-07   5
 ---               
172: 2013-06-23 172
173: 2013-06-24 173
174: 2013-06-25 174
175: 2013-06-26 175
176: 2013-06-27 176

Here is my second table

dt2 = data.table(Date=seq(from=as.Date("2013-01-01"),
         to=as.Date("2013-06-30"), by="1 week"),key="Date")

          Date
 1: 2013-01-01
 2: 2013-01-08
 3: 2013-01-15
 4: 2013-01-22
 5: 2013-01-29
 ---
22: 2013-05-28
23: 2013-06-04
24: 2013-06-11
25: 2013-06-18
26: 2013-06-25

Here is the code I would use in data.table for the desired output.

dt1[dt2, roll=Inf]

          Date ind
 1: 2013-01-01  NA
 2: 2013-01-08   6
 3: 2013-01-15  13
 4: 2013-01-22  20
 5: 2013-01-29  27
---
22: 2013-05-28 146
23: 2013-06-04 153
24: 2013-06-11 160
25: 2013-06-18 167
26: 2013-06-25 174

Is this even possible using postgres (or more generally speaking, SQL? Thank you so much for any help you can provide.

Clayton Stanley · Accepted Answer

I'd really be interested to see if someone can do this without populating the full cross-join table first. But here's a solution with cross join:

http://sqlfiddle.com/#!2/b2f3f/3/0

Create the schema:

CREATE TABLE Table1
    (`t1` double, `ind` int)
;

INSERT INTO Table1
    (`t1`, `ind`)
VALUES
    (1, 1),
    (1.9, 2),
    (3.1, 3),
    (4, 4),
    (5.1, 5),
    (5.9, 6)
;

CREATE TABLE Table2
    (`t2` int)
;

INSERT INTO Table2
    (`t2`)
VALUES
    (1),
    (2),
    (3),
    (4),
    (5),
    (6)
;

Query:

select t2, max(ind)
from (select t2, ind
      from table1
      cross join table2
      where t1 <= t2) as foo
group by t2

Result:

T2  MAX(IND)
1   1
2   2
3   2
4   4
5   4
6   6

EDIT: @Hadley's comment is correct, that the full cross join table is never materialized using the query above, as the query above produces the same explain and results as the query below:

select t2, max(ind)
from table1
cross join table2
where t1 <= t2
group by t2

Mike.Gahan · Answer

Since I am working in a special Postgres environment, apparently it does not allow me to cross join. Thanks so much to @Clayton Stanley for the great answer, but I had to try another route. It looks to be working so far. I apologize for not being able to perform time comparisons of both methods.

Create the schema

CREATE TABLE Table1
    (`id` int,`t1` double, `ind` int)
;

INSERT INTO Table1
    (`id`,`t1`, `ind`)
VALUES
    (1,0.99, 5),
    (1,1.90, 10),
    (2,3.10, 12),
    (2,4.00, 3),
    (3,5.10, 8),
    (3,5.90, 16),
    (4,5.90, 7),
    (4,5.99, 20)
;



CREATE TABLE Table2
    (`id` int, `t2` double)
;

INSERT INTO Table2
    (`id`,`t2`)
VALUES
    (1,1.00),
    (2,3.95),
    (3,5.05),
    (4,6.01)
;

Perform the join using subqueries

select B.*
from Table2 as A
join Table1 as B
on B.id=A.id
join(
    select 
       SUBB.id,
       max(SUBB.t1) as t1
    from Table2 as SUBA
    join Table1 as SUBB
    on SUBB.id=SUBA.id and
    SUBB.t1 <= SUBA.t2
    group by SUBB.id
    )
as subqry
on B.t1=subqry.t1  and
   A.id=subqry.id

The example schema and output is here:

Link to schema

Rolling joins (LOCF) in Postgres

Tags:

r

postgresql

data.table

Mike.Gahan

Video Answer

2 Answers

Clayton Stanley

Mike.Gahan

Recent Activity

Donate For Us

Rolling joins (LOCF) in Postgres

Tags:

r

postgresql

data.table

Mike.Gahan

Video Answer

2 Answers

Clayton Stanley

Mike.Gahan

Related questions

Recent Activity

Donate For Us