Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Equivalent for Keep in Snowflake

I am trying to convert a oracle code to snowflake,

In Oracle:

MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct) "Worst",

What would be corresponding thing as We dont have "keep" in oracle

like image 656
asd Avatar asked Oct 13 '25 03:10

asd


2 Answers

so using this CTE for the example data:

WITH data AS (  
    SELECT * FROM VALUES 
        ('a', 2300, 10.1), 
        ('a',4000, 28.7), 
        ('b', 3000, 90.0) 
    AS v(dept, salary, commission_pct)  
)

And applying Gordon's code:

SELECT dept
    ,MIN(CASE WHEN seqnum = 1 THEN salary end) AS worst
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY dept ORDER BY commission_pct) AS seqnum
      FROM data AS t
) 
GROUP BY 1 ORDER BY 1; 

We get the first item from each department (dept) based on commission_pct, and then we take the MIN of those values.

2300

If we remove the Min and thus have

WITH data AS (  
    SELECT * FROM values 
        ('a', 2300, 10.1), 
        ('a',4000, 28.7), 
        ('b', 3000, 90.0) 
    AS v(dept, salary, commission_pct)  
)
SELECT 
    CASE WHEN seqnum = 1 THEN salary END AS worst
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY dept ORDER BY commission_pct) as seqnum
      FROM data AS t
) ;

we get

WORST
2300
NULL
3000

So the difference is in the QUALIFY case, the non-first rows are not actually returned. Thus are not accessible for other opperations.

WITH data AS (  
    SELECT * FROM VALUES 
        ('a', 2300, 10.1), 
        ('a',4000, 28.7), 
        ('b', 3000, 90.0) 
    AS v(dept, salary, commission_pct)  
)
SELECT 
    salary AS worst
FROM data 
QUALIFY rOW_NUMBER() OVER (PARTITION BY dept ORDER BY commission_pct) = 1
;

just gives:

WORST
2300
3000

But Snowflake does have FIRST_VALUE, thus the effect of the KEEP

WITH data AS (  
    SELECT * FROM VALUES 
        ('a', 2300, 10.1), 
        ('a',4000, 28.7), 
        ('b', 3000, 90.0) 
    AS v(dept, salary, commission_pct)  
)
SELECT t.*
    ,first_value(salary) OVER (PARTITION BY dept ORDER BY commission_pct) as same_as_keep
FROM data AS t
;

giving:

DEPT    SALARY    COMMISSION_PCT  SAME_AS_KEEP
a       2300      10.1            2300
a       4000      28.7            2300
b       3000      90.0            3000

thus you (with some a sub-select required to disambiuate the double WINDOW functions)

WITH data AS (  
    SELECT * FROM VALUES 
        ('a', 2300, 10.1), 
        ('a',4000, 28.7), 
        ('b', 3000, 90.0) 
    AS v(dept, salary, commission_pct)  
)
SELECT q.*,
    min(same_as_keep) over (partition by true) as worst
FROM (
    SELECT t.*
        ,first_value(salary) OVER (PARTITION BY dept ORDER BY commission_pct) as same_as_keep
    FROM data AS t
  ) AS q
;

giving:

DEPT    SALARY    COMMISSION_PCT    SAME_AS_KEEP    WORST
a       2300      10.1              2300            2300
a       4000      28.7              2300            2300
b       3000      90.0              3000            2300

But like a lot of things it all depends how you are using KEEP to what asspect of it's behavour you are wanting.

For example I do not know if if you swap MIN for COUNT if KEEP for this example data would give you 2 which is like Gordon's CASE version or if it gives you 3 which mean it behaves like FIRST VALUE.

like image 194
Simeon Pilgrim Avatar answered Oct 14 '25 17:10

Simeon Pilgrim


It is possible to achieve the result using MIN_BY/MAX_BY:

Finds the row(s) containing the minimum or maximum value for a column and returns the value of another column in that row.

MIN(salary) KEEP (DENSE_RANK FIRST ORDER BY commission_pct) AS Worst

Is the equivalent of:

MIN_BY(salary, commission_pct)
like image 41
Lukasz Szozda Avatar answered Oct 14 '25 19:10

Lukasz Szozda