I'm having serious performance problems with a particular SQL row-to-column translation. A single query takes up to a minute to pull up 6 hours of data. I've simplified the tables below for clarity, but there's well over a million datapoints per day so the table size might be contributing. I don't have much experience with complex queries like this so I'd like some advice. The data tables are as follows...
(DataPoint table)
id datetime
_____________________
1 2015-09-08 21:00:00
2 2015-09-08 21:00:01
3 2015-09-08 21:00:02
4 2015-09-08 21:00:03
5 2015-09-08 21:00:04
(SensorData table)
id datapointId SensorId Temp DateTime
_____________________________________________________
1 1 20 34.6 2015-09-08 21:00:00.345
2 1 21 34.2 2015-09-08 21:00:00.551
3 1 22 34.1 2015-09-08 21:00:00.101
4 1 41 34.3 2015-09-08 21:00:00.700
5 2 20 34.2 2015-09-08 21:00:01.223
6 2 21 34.4 2015-09-08 21:00:01.456
7 2 22 34.5 2015-09-08 21:00:01.100
8 2 41 34.6 2015-09-08 21:00:01.870
Note that that individual sensor DateTime and the DataPoint DateTime are not equal (but they are close). This is why I went with a second table to give me the ability to do a rough side-by-side comparison without having to do some fuzzy dateTime logic.
I need to query this information with a resulting dataset that looks like this...
DateTime Temp1 Temp2 Temp3
__________________________________________________
2015-09-08 21:00:00 34.5 34.1 41.1
2015-09-08 21:00:01 34.4 34.2 41.2
2015-09-08 21:00:02 34.4 34.2 41.2
My current SQL query is written (dynamically) as such...
SELECT DataPoint.DateTime,
max(case when SensorData.SensorId = 20 then SensorData.Temp end) Temp1,
max(case when SensorData.SensorId = 21 then SensorData.Temp end) Temp2,
max(case when SensorData.SensorId = 22 then SensorData.Temp end) Temp3,
FROM DataPoint LEFT JOIN SensorData ON DataPoint.Id=dbo.SensorData.DataPointId
WHERE DataPoint.DateTime BETWEEN [x] and [y]
GROUP BY DataPoint.DateTime ORDER BY DataPoint.DateTime
So my question has three parts:
1) Why is this particular query so slow?
2) Is there a better method for storing this information that I'm missing? I'm still in the design phase at this point. I chose this schema because I need to be able to compare timeseries information for sensors that fire off their data at irregular intervals.
3) Is there a faster way to query and translate this data to my desired format?
EDIT!!! Sorry, there is a GroupBy clause at the end of my query that I forgot to add. My bad.
First you need make sure have index for DataPointId
, if that is a PK
probably have index, but if is a FK
you will have to manually add it.
Second you need index for SensorId
and DateTime
Third filter the SensorID
before the query right now you are processing all the million record to generate the report
.
WHERE SensorID IN (20,21,22)
Also you can try PIVOT
function
SqlFiddleDemo
SELECT [DateTime], [20] as Temp1, [21] as Temp2, [22] as Temp3
FROM
(SELECT [DateTime], SensorId, [Temp]
FROM sensor) AS SourceTable
PIVOT
(
MAX([Temp])
FOR SensorId IN ([20], [21], [22])
) AS PivotTable;
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With