I receive a daily txt file of data from multiple antennas. The naming convention for the files is:
Unique Antenna ID + year + month + day + Random 3 digit number
I parsed the file names and created a table like this one:
AntennaID fileyear filemonth fileday filenumber filename
0000 2016 09 22 459 000020160922459.txt
0000 2016 09 21 981 000020160921981.txt
0000 2016 09 20 762 000020160920762.txt
0001 2016 09 22 635 000120160922635.txt
.
.
.
etc. (200k rows)
Sometimes the antennas send either more than 1 file or no file at all. The unique 3 digit filenumber distinguishes the files if more than 1 gets sent, however I'm trying to find the days when a file wasn't sent.
I've tried a couple groupby statements to compare the amount of datafiles in a given month and to see if it matches up w/ the days in that month - but the problem is sometimes the antennas send more than 1 file per day which could artificially make up for a "missing" file if we're just comparing the count.
I'm seeking a more robust method for finding the dates or date ranges for the missing files. I've looked into the Partition and Over functions and feel as there might be potential there, but I'm unsure as to how to use them as I'm fairly new to SQL.
I'm using Microsoft SQL Server 2016
You can use a common table expression (or cte
for short) to create a table of dates. You can then join
from this table to your Antenna data and look for dates that return a null
value:
declare @MinDate date = getdate()-50
declare @MaxDate date = getdate()
;with Dates as
(
select @MinDate as DateValue
union all
select dateadd(d,1,DateValue)
from Dates
where DateValue < @MaxDate
)
select d.DateValue
from Dates d
left join AntennaData a
on(d.DateValue = cast(cast(a.fileyear as nvarchar(4)) + cast(a.filemonth as nvarchar(4)) + cast(a.fileday as nvarchar(4)) as date))
option (maxrecursion 0)
Whilst the recursive CTE will generate the list of dates, it is not the most efficient way to do so. If speed is important for you, use a set based tally table instead:
declare @MinDate date = getdate()-50;
declare @MaxDate date = getdate();
-- Generate table with 10 rows
with t(t) as (select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1 union all select 1)
-- Add row numbers (-1 to start at adding 0 to retain @MinDate value) based on tally table to @MinDate for the number of days +1 (to ensure Min and Max date are included) between the two dates
,d(d) as (select top(datediff(day, @MinDate, @MaxDate)+1) dateadd(day,row_number() over (order by (select null))-1,@MinDate)
from t t1,t t2,t t3,t t4,t t5,t t6 -- Cross join creates 10^6 or 10*10*10*10*10*10 = 1,000,000 row table
)
select *
from d;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With