SQL Server: time-series data performance

I have a table of a little over 1 billion rows of time-series data with fantastic insert performance but (sometimes) awful select performance.

Table tblTrendDetails (PK is ordered as shown):

PK  TrendTime    datetime
PK  CavityId     int
PK  TrendValueId int
    TrendValue   real
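
In other words, the table looks roughly like this (the constraint name here is made up):

CREATE TABLE tblTrendDetails (
    TrendTime    datetime NOT NULL,
    CavityId     int      NOT NULL,
    TrendValueId int      NOT NULL,
    TrendValue   real     NULL,
    CONSTRAINT PK_tblTrendDetails
        PRIMARY KEY CLUSTERED (TrendTime, CavityId, TrendValueId)
);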

The table is continuously pulling in new data and purging old data, so insert and delete performance needs to remain snappy.

When executing a query such as the following, performance is poor (30 sec):

SELECT * 
FROM tblTrendDetails
WHERE TrendTime BETWEEN @inMinTime AND @inMaxTime
  AND CavityId = @inCavityId
  AND TrendValueId = @inTrendId

If I execute the same query again (with a similar time range, but any values for @inCavityId or @inTrendId), performance is very good (1 sec). Performance counters show that disk access is the culprit the first time the query is run.

Any recommendations regarding how to improve performance without (significantly) adversely affecting the insert or delete performance? Any suggestions (including completely changing the underlying database) are welcome.

asked Oct 23 '12 by pilotcam



1 Answer

The fact that subsequent queries of the same or similar data run much faster is probably due to SQL Server caching your data. That said, is it possible to speed this initial query up?
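
Incidentally, if you want to reproduce the slow cold-cache case while testing (on a non-production server only), you can flush the buffer pool first:

-- Write dirty pages to disk, then drop clean pages from the buffer pool,
-- forcing the next query to read from disk (test systems only!):
CHECKPOINT;
DBCC DROPCLEANBUFFERS;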

Verify the query plan:

My guess is that your query should result in an Index Seek rather than an Index Scan (or worse, a Table Scan). Please verify this using SET SHOWPLAN_TEXT ON; or a similar feature. A query that uses BETWEEN on the leading TrendTime column and = on the remaining key columns should be able to take advantage of the clustered index, but that's worth confirming rather than assuming.
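
For example, something like the following shows the estimated plan as text without actually executing the query (the literal parameter values here are placeholders):

SET SHOWPLAN_TEXT ON;
GO
SELECT *
FROM tblTrendDetails
WHERE TrendTime BETWEEN '20121001' AND '20121008'
  AND CavityId = 1
  AND TrendValueId = 1;
GO
SET SHOWPLAN_TEXT OFF;
GO

In the output, you want to see Clustered Index Seek rather than Clustered Index Scan or Table Scan.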

Index Fragmentation:

It is possible that your clustered index (the primary key in this case) is quite fragmented after all of those inserts and deletes. I would probably check this with DBCC SHOWCONTIG (tblTrendDetails).
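
For example (and on SQL Server 2005 and later, the sys.dm_db_index_physical_stats DMV replaces DBCC SHOWCONTIG):

-- Older syntax; low Scan Density / high Logical Scan Fragmentation
-- in the output indicate a fragmented index:
DBCC SHOWCONTIG (tblTrendDetails);

-- Newer equivalent; look at avg_fragmentation_in_percent:
SELECT index_id, avg_fragmentation_in_percent, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID(N'tblTrendDetails'), NULL, NULL, 'LIMITED');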

You can defrag the table's indexes with DBCC INDEXDEFRAG (MyDatabase, tblTrendDetails). This may take some time, but will allow the table to remain accessible, and you can stop the operation without any nasty side-effects.

You might have to go further and use DBCC DBREINDEX (tblTrendDetails). This is an offline operation, though, so you should only do this when the table does not need to be accessed.
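
For reference, both operations side by side, along with the ALTER INDEX forms that replace them on SQL Server 2005 and later:

-- Online defrag; the table stays accessible and the operation
-- can be stopped without losing the work already done:
DBCC INDEXDEFRAG (MyDatabase, tblTrendDetails);

-- Full rebuild; the table is unavailable while this runs:
DBCC DBREINDEX (tblTrendDetails);

-- On 2005 and later, the equivalents are:
ALTER INDEX ALL ON tblTrendDetails REORGANIZE;
ALTER INDEX ALL ON tblTrendDetails REBUILD; -- WITH (ONLINE = ON) on Enterprise Edition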

There are some differences described here: Microsoft SQL Server 2000 Index Defragmentation Best Practices.

Be aware that your transaction log can grow quite a bit from defragging a large table, and it can take a long time.
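
You can keep an eye on log usage while the defrag runs:

-- Reports log size and percent used for every database:
DBCC SQLPERF (LOGSPACE);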

Partitioned Views:

If these do not remedy the situation (or fragmentation is not a problem), you may even wish to look into partitioned views, in which you create a number of underlying base tables for different ranges of records and then UNION ALL them in a view (which replaces your original table).
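
Here is a minimal sketch of that approach with hypothetical monthly tables (the names and date ranges are made up). The CHECK constraints on TrendTime let the optimizer skip base tables whose range can't match the query, and purging old data becomes as cheap as dropping a base table:

-- One base table per month, each constrained to its own TrendTime range:
CREATE TABLE tblTrendDetails_201210 (
    TrendTime    datetime NOT NULL
        CHECK (TrendTime >= '20121001' AND TrendTime < '20121101'),
    CavityId     int      NOT NULL,
    TrendValueId int      NOT NULL,
    TrendValue   real     NULL,
    PRIMARY KEY (TrendTime, CavityId, TrendValueId)
);
-- ...one such table per month...
GO

-- The view stands in for the original table in queries (and, if the
-- tables meet the partitioned-view rules, for inserts as well):
CREATE VIEW vwTrendDetails
AS
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM tblTrendDetails_201210
UNION ALL
SELECT TrendTime, CavityId, TrendValueId, TrendValue FROM tblTrendDetails_201211;
GO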

Better Stuff:

If performance of these selects is a real business need, you may be able to make the case for better hardware: faster drives, more memory, etc. If your drives are twice as fast, this query should take roughly half the time, yeah? Also, this may not be workable for you, but I've simply found newer versions of SQL Server to be genuinely faster, with more options and easier maintenance. I'm glad to have moved most of my company's data to 2008 R2. But I digress...

answered Sep 29 '22 by Tim Lehner