I have a table with a primary key (bigint), datetime, value, foreignKey to configuration tabel that consists of 100,000's of rows. I want to be able to obtain a row for a variable time interval. For example.
Select Timestamp, value from myTable where configID=3
AND{most recent for 15 min interval}
I have a CTE query that returns multiple rows for the interval interval
WITH Time_Interval(timestamp, value, minutes)
AS
(
Select timestamp, value, DatePart(Minute, Timestamp) from myTable
Where Timestamp >= '12/01/2012' and Timestamp <= 'Jan 10, 2013' and
ConfigID = 435 and (DatePart(Minute, Timestamp) % 15) = 0
)
Select Timestamp, value, minutes from Time_Interval
group by minutes, value, timestamp
order by Timestamp
such as:
2012-12-19 18:15:22.040 6.98 15
2012-12-19 18:15:29.887 6.98 15
2012-12-19 18:15:33.480 7.02 15
2012-12-19 18:15:49.370 7.01 15
2012-12-19 18:30:41.920 6.95 30
2012-12-19 18:30:52.437 6.93 30
2012-12-19 19:15:18.467 7.13 15
2012-12-19 19:15:34.250 7.11 15
2012-12-19 19:15:49.813 7.12 15
But as can be seen there are 4 for the 1st 15 minute interval, 2 for the next interval, etc... Worse, If no data was obtain at an exact times stamp of 15 minutes, then there will be no value.
What I want is the most recent value for a fifteen minute interval... if if the only data for that intervall occurred at 1 second after the start of the interval.
I was thinking of Lead/over but again... the rows are not orgainzed that way. Primary Key is a bigInt and is a clustered Index. Both the timstamp column and ConfigID columns are Indexed. The above query returns 4583 rows in under a second.
Thanks for any help.
To select every nth row of a DataFrame - we will use the slicing method. Slicing in pandas DataFrame is similar to slicing a list or a string in python. Suppose we want every 2nd row of DataFrame we will use slicing in which we will define 2 after two :: (colons).
Here's the SQL query to select every nth row in MySQL. mysql> select * from table_name where table_name.id mod n = 0; In the above query, we basically select every row whose id mod n value evaluates to zero.
ROW_NUMBER (Window Function) ROW_NUMBER (Window Function) is a standard way of selecting the nth row of a table. It is supported by all the major databases like MySQL, SQL Server, Oracle, PostgreSQL, SQLite, etc.
Try this on for size. It will even handle returning one row for instances when you have multiple timestamps for a given interval. NOTE: This assumes your Bigint PK column is named: idx. Just substitute where you see "idx" if it is not.
;WITH Interval_Helper([minute],minute_group)
AS
(
SELECT 0, 1 UNION SELECT 1, 1 UNION SELECT 2, 1 UNION SELECT 3, 1 UNION SELECT 4, 1
UNION SELECT 5, 1 UNION SELECT 6, 1 UNION SELECT 7, 1 UNION SELECT 8, 1 UNION SELECT 9, 1
UNION SELECT 10, 1 UNION SELECT 11, 1 UNION SELECT 12, 1 UNION SELECT 13, 1 UNION SELECT 14, 1
UNION SELECT 15, 2 UNION SELECT 16, 2 UNION SELECT 17, 2 UNION SELECT 18, 2 UNION SELECT 19, 2
UNION SELECT 20, 2 UNION SELECT 21, 2 UNION SELECT 22, 2 UNION SELECT 23, 2 UNION SELECT 24, 2
UNION SELECT 25, 2 UNION SELECT 26, 2 UNION SELECT 27, 2 UNION SELECT 28, 2 UNION SELECT 29, 2
UNION SELECT 30, 3 UNION SELECT 31, 3 UNION SELECT 32, 3 UNION SELECT 33, 3 UNION SELECT 34, 3
UNION SELECT 35, 3 UNION SELECT 36, 3 UNION SELECT 37, 3 UNION SELECT 38, 3 UNION SELECT 39, 3
UNION SELECT 40, 3 UNION SELECT 41, 3 UNION SELECT 42, 3 UNION SELECT 43, 3 UNION SELECT 44, 3
UNION SELECT 45, 4 UNION SELECT 46, 4 UNION SELECT 47, 4 UNION SELECT 48, 4 UNION SELECT 49, 4
UNION SELECT 50, 4 UNION SELECT 51, 4 UNION SELECT 52, 4 UNION SELECT 53, 4 UNION SELECT 54, 4
UNION SELECT 55, 4 UNION SELECT 56, 4 UNION SELECT 57, 4 UNION SELECT 58, 4 UNION SELECT 59, 4
)
,Time_Interval([timestamp], value, [date], [hour], minute_group)
AS
(
SELECT A.[Timestamp]
,A.value
,CONVERT(smalldatetime, CONVERT(char(10), A.[Timestamp], 101))
,DATEPART(HOUR, A.[Timestamp])
,B.minute_group
FROM myTable A
JOIN Interval_Helper B
ON (DATEPART(minute, A.[Timestamp])) = B.[minute]
AND A.[Timestamp] >= '12/01/2012'
AND A.[Timestamp] <= '01/10/2013'
AND A.ConfigID = 435
)
,Time_Interval_TimeGroup([date], [hour], [minute], MaxTimestamp)
AS
(
SELECT [date]
,[hour]
,minute_group
,MAX([Timestamp]) as MaxTimestamp
FROM Time_Interval
GROUP BY [date]
,[hour]
,minute_group
)
,Time_Interval_TimeGroup_Latest(MaxTimestamp, MaxIdx)
AS
(
SELECT MaxTimestamp
,MAX(idx) as MaxIdx
FROM myTable A
JOIN Time_Interval_TimeGroup B
ON A.[Timestamp] = B.MaxTimestamp
GROUP BY MaxTimestamp
)
SELECT A.*
FROM myTable A
JOIN Time_Interval_TimeGroup_Latest B
ON A.idx = B.MaxIdx
ORDER BY A.[timestamp]
This is another take on the clever time group function from @MntManChris below:
CREATE FUNCTION dbo.fGetTimeGroup (@DatePart tinyint, @Date datetime)
RETURNS int
AS
BEGIN
RETURN CASE @DatePart
WHEN 1 THEN DATEPART(mi, @Date)
WHEN 2 THEN DATEPART(mi, @Date)/5 + 1 -- 5 min
WHEN 3 THEN DATEPART(mi, @Date)/15 + 1 -- 15 min
WHEN 4 THEN DATEPART(mi, @Date)/30 + 1 -- 30 min
WHEN 5 THEN DATEPART(hh, @Date) -- hr
WHEN 6 THEN DATEPART(hh, @Date)/6 + 1 -- 6 hours
WHEN 7 THEN DATEPART(hh, @Date)/12 + 1 -- 12 hours
WHEN 8 THEN DATEPART(d, @Date) -- day
ELSE -1
END
END
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With