Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Calculate alarm flood in snowflake

Tags:

I am trying to do an alarm flood calculation in snowflake. I created the below dataset using a snowflake window function. So if the value is greater or equal to 3, then the alarm flood will start and for the next 0 value, it will end. So in the below example, the alarm flood started at "9:51' and ended at "9:54" which was for 3 minutes. The next flood started at "9:57" and ended at "10:02" which is for 5 minutes.FYI, value at 9:59 is 3, but as a flood is already started, we don't have to consider it. The next flood is at 10:03 but there is no 0 value, so we have to consider edge value 10:06. So total time in flood is 3+5+4= 12 minutes.

   DateTime    Value
3/10/2020 9:50  1
3/10/2020 9:51  3
3/10/2020 9:52  1
3/10/2020 9:53  2
3/10/2020 9:54  0
3/10/2020 9:55  0
3/10/2020 9:56  1
3/10/2020 9:57  3
3/10/2020 9:58  2
3/10/2020 9:59  3
3/10/2020 10:00 2
3/10/2020 10:01 2
3/10/2020 10:02 0
3/10/2020 10:03 3
3/10/2020 10:04 1
3/10/2020 10:05 1
3/10/2020 10:06 1

so, in short, I am expecting below output

enter image description here

I tried below SQL but it does not give me the correct output, it fails in second flood time (as there again value 3 before next 0)

select t.*,
       (case when value >= 3
             then datediff(minute,
                           datetime,
                           min(case when value = 0 then datetime end) over (order by datetime desc)
                          )
        end) as diff_minutes
from t;
like image 677
Dany Avatar asked Mar 24 '20 08:03

Dany


1 Answers

I'm not the most proud of this code, but it works and gives a starting place. I'm sure it can be cleaned up or simplified. and I haven't assessed performance for larger tables.

The key insight that I used is that if you add the date_diff to the date, then you can find situations where they both add to the same value, meaning that they are both counting to the same "0" record. Hopefully this concept is helpful if nothing else.

Also, the first cte is a semi-hacky way to get that 4 at the end of your results.

--Add a fake zero at the end of the table to provide a value for
-- comparing high values that have not been resolved
-- added a flag so this fake value can be removed later
with fakezero as
(
SELECT datetime, value, 1 flag
FROM test

UNION ALL

SELECT dateadd(minute, 1, max(datetime)) datetime, 0 value, 0 flag
FROM test  
)

-- Find date diffs between high values and subsequent low values
,diffs as (
select t.*,
       (case when value >= 3
             then datediff(minute,
                           datetime,
                           min(case when value = 0 then datetime end) over (order by datetime desc)
                          )
        end) as diff_minutes
from fakezero t
)

--Fix cases where two High values are "resolved" by the same low value
--i.e. when adding the date_diff to the datetime results in the same timestamp
-- this means that the prior high value record that still hasn't been "resolved"
select
  datetime
  ,value
  ,case when 
      lag(dateadd(minute, diff_minutes, datetime)) over(partition by value order by datetime)
      = dateadd(minute, diff_minutes, datetime)
    then null 
    else diff_minutes 
  end as diff_minutes
from diffs
where flag = 1
order by datetime;
like image 133
David Garrison Avatar answered Oct 02 '22 16:10

David Garrison