Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Filling in NULLS with previous records - Netezza SQL

I am using Netezza SQL on Aginity Workbench and have the following data:

id           DATE1              DATE2
1            2013-07-27         NULL
2            NULL               NULL
3            NULL               2013-08-02
4            2013-09-10         2013-09-23
5            2013-12-11         NULL
6            NULL               2013-12-19

I need to fill in all the NULL values in DATE1 with preceding values in the DATE1 field that are filled in. With DATE2, I need to do the same, but in reverse order. So my desired output would be the following:

id           DATE1              DATE2
1            2013-07-27         2013-08-02
2            2013-07-27         2013-08-02
3            2013-07-27         2013-08-02
4            2013-09-10         2013-09-23
5            2013-12-11         2013-12-19
6            2013-12-11         2013-12-19

I only have read access to the data. So creating Tables or views are out of the question

like image 275
crayfishcray Avatar asked Mar 15 '23 02:03

crayfishcray


2 Answers

How about this?

select
  id
  ,last_value(date1 ignore nulls) over (
    order by id
    rows between unbounded preceding and current row
  ) date1
  ,first_value(date2 ignore nulls) over (
    order by id
    rows between current row and unbounded following
  ) date2

You can manually calculate this as well, rather than relying on the windowing functions.

with chain as (
  select 
    this.*,
    prev.date1 prev_date1,
    case when prev.date1 is not null then abs(this.id - prev.id) else null end prev_distance,
    next.date2 next_date2,
    case when next.date2 is not null then abs(this.id - next.id) else null end next_distance
  from 
    Table1 this 
    left outer join Table1 prev on this.id >= prev.id
    left outer join Table1 next on this.id <= next.id
), min_distance as (
  select
    id,
    min(prev_distance) min_prev_distance,
    min(next_distance) min_next_distance
  from
    chain
  group by
    id
)
select
  chain.id,
  chain.prev_date1,
  chain.next_date2
from
  chain
  join min_distance on 
    min_distance.id = chain.id
    and chain.prev_distance = min_distance.min_prev_distance
    and chain.next_distance = min_distance.min_next_distance
order by chain.id

If you're unable to calculate the distance between IDs by subtraction, just replace the ordering scheme by a row_number() call.

like image 61
Jeremy Fortune Avatar answered Apr 01 '23 23:04

Jeremy Fortune


I think Netezza supports the order by clause for max() and min(). So, you can do:

select max(date1) over (order by date1) as date1,
       min(date2) over (order by date2 desc) as date2
 . . .

EDIT:

In Netezza, you may be able to do this with last_value() and first_value():

select last_value(date1 ignore nulls) over (order by id rows between unbounded preceding and 1 preceding) as date1,
       first_value(date1 ignore nulls) over (order by id rows between 1 following and unbounded following) as date2

Netezza doesn't seem to support IGNORE NULLs on LAG(), but it does on these functions.

like image 20
Gordon Linoff Avatar answered Apr 01 '23 22:04

Gordon Linoff