Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Snowflake bug in query(condition: "or" + "exists" + "<= + <=".)

query:

with q1 as (
  select 5 as dayid,1 as isok -- change 5 to 3 and query returns correct answer
  union all
  select 4 as dayid,0 as isok
)
,q2 as (
  select 4 as startday,4 as endday
  union all
  select 8 as startday,8 as endday
)
, r1 as (
  select 1 as id,* from q1 j
  where (j.isok=0)
  union all
  select 2 as id,* from q1 j
  where (exists (select 1 from q2 k where k.startday<=j.dayid and j.dayid<=k.endday))
)
, r2 as (
  select 3 as id,* from q1 j
  where (j.isok=0)
     or (exists (select 1 from q2 k where k.startday<=j.dayid and j.dayid<=k.endday))
)
select 2 as level,* from r2
union all
select 1 as level,* from r1
;

returns:

LEVEL ID DAYID ISOK
- - - -
2 3 5 1
2 3 4 0
1 1 4 0
1 2 4 0

4 row(s)

It should return:

 level | id | dayid | isok 
-------+----+-------+------
     2 |  3 |     4 |    0
     1 |  1 |     4 |    0
     1 |  2 |     4 |    0

(3 rows)

like that same query returns in postgres + sqlserver.

If you change select 5 to select 3 in q1, you get right result.

The problem is in query where you use "or" + "exists" + "<= + <=".

like image 840
Ismo Avatar asked Mar 13 '26 17:03

Ismo


1 Answers

It seems that the clue is query rewrite performed behind the scene. Using Simeon's example:

with q1(dayid, isok) as (
    SELECT * FROM VALUES
    (5,1), 
    (4,0)
),q2(startday, endday) as (
    SELECT * FROM VALUES
    (4,4), 
    (8,8)
)
select 3 as id,
    j.*,
    j.isok=0 as clause_1
from q1 as j
where clause_1 
 OR (exists (select 1 from q2 k where k.startday<=j.dayid and j.dayid<=k.endday))
;

db<>fiddle demo

Snowflake's output:

enter image description here


Running only exists condition(SEMI JOIN):

with q1(dayid, isok) as (
        SELECT * FROM VALUES
        (5,1), 
        (4,0)
),q2(startday, endday) as (
        SELECT * FROM VALUES
        (4,4), 
        (8,8)
)
select 3 as id,
        j.*,
        j.isok=0 as clause_1
from q1 as j
where (exists (select 1 from q2 k where k.startday<=j.dayid and j.dayid<=k.endday))
;

enter image description here

With OR applied:

with q1(dayid, isok) as (
        SELECT * FROM VALUES
        (5,1), 
        (4,0)
),q2(startday, endday) as (
        SELECT * FROM VALUES
        (4,4), 
        (8,8)
)
select 3 as id,
        j.*,
        j.isok=0 as clause_1
from q1 as j
where (exists (select 1 from q2 k where k.startday<=j.dayid and j.dayid<=k.endday)) 
  or clause_1
    ;

Here the query was executed as LEFT JOIN but this time on aggregated values from q2: MIN(col1), MAX(col2).

enter image description here

It casues that row 3 5 1 FALSE is added to the output.

like image 77
Lukasz Szozda Avatar answered Mar 16 '26 15:03

Lukasz Szozda