Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Window based averages based on ranges within another table

I have a table that has positions like so:

create or replace table data (
  pos int not null,
  val float not null,
  constraint data_pk primary key (pos)
);

And, a ranges type table like so:

create or replace table ranges (
  label varchar(32) not null,
  left int not null,
  right int not null,
  constraint ranges_pk primary key (label)
);

with ranges like

        ('alpha', 11, 13),
        ('bravo', 11, 14),
        ('charlie', 11, 15),
        ('echo', 12, 18),
        ('delta', 12, 19),
        ('foxtrot', 13, 20)

for each label, I need to lookup every possible 3 subrange within the "data" table, take these 3 subrange averages, and then average them...

I couldn't think of a good way to describe what I'm after, so I thought I'd show what I'd expect for 'charlie':

The results for charlie in the select should be:

    ('charlie', 40.111), -- avg(avg(data[pos=11], data[pos=12], data[pos=13]), avg(data[pos=12], data[pos=13], data[pos=14]), avg(data[pos=13], data[pos=14], data[pos=15]))
                         -- -> avg(avg(31, 37, 41), avg(37, 41, 43), avg(41, 43, 47))
                         -- -> avg(36.333, 40.333, 43.667) -> 40.111

(for data like)

insert into data (pos, val) values
    (1, 2), (2, 3), (3, 5), (4, 7), (5, 11), (6, 13), (7, 17), (8, 19),
    (9, 23), (10, 29), (11, 31), (12, 37), (13, 41), (14, 43), (15, 47), (16, 53),
    (17, 59), (18, 61), (19, 67), (20, 71), (21, 73), (22, 79), (23, 83), (24, 89),
    (25, 97), (26, 101), (27, 103), (28, 107), (29, 109), (30, 113), (31, 127), (32, 131),
    (33, 137), (34, 139), (35, 149), (36, 151), (37, 157), (38, 163), (39, 167), (40, 173),
    (41, 179), (42, 181), (43, 191), (44, 193), (45, 197), (46, 199), (47, 211), (48, 223),
    (49, 227), (50, 229), (51, 233), (52, 239), (53, 241), (54, 251);

Is there a way to do this within Snowflake SQL? Or must I resort to python to do this? If it helps I made a gist with more data..

Thanks!

like image 254
bnsh Avatar asked Apr 18 '26 11:04

bnsh


1 Answers

Is there a way to do this within Snowflake SQL? Or must I resort to python to do this?

SQL language is expressive enough to handle such case.

Key point here is to use windowed average with windows size of 3 and then average moving averages:

WITH cte AS (
 SELECT r.label, r.left, r.right, d.val, 
        AVG(d.val) OVER(PARTITION BY r.label ORDER BY d.pos ROWS 
                        BETWEEN 2 PRECEDING AND CURRENT ROW) AS r
 FROM ranges r
 JOIN data d
   ON d.pos BETWEEN r.left AND r.right
  QUALIFY ROW_NUMBER() OVER(PARTITION BY r.label ORDER BY d.pos) > 2 
)
SELECT label, AVG(r) AS output
FROM cte
GROUP BY label
ORDER BY label;

Output:

enter image description here


Intermediate step to ilustrate:

WITH cte AS (
 SELECT r.label, r.left, r.right, d.val,
        AVG(d.val) OVER(PARTITION BY r.label ORDER BY d.pos 
                        ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS r
 FROM ranges r
 JOIN data d
   ON d.pos BETWEEN r.left AND r.right
  QUALIFY ROW_NUMBER() OVER(PARTITION BY r.label ORDER BY d.pos) > 2 
)
SELECT *
FROM cte
ORDER BY label, r;

Output:

enter image description here

like image 199
Lukasz Szozda Avatar answered Apr 21 '26 13:04

Lukasz Szozda