Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Forcing SQL Server to pre-cache entire database into memory

We have a client site with a 50Gb SQL 2012 database on a server with 100+ Gb of RAM.

As the application is used, SQL server does a great job of caching the db into memory but the performance increase from the caching occurs the SECOND time a query is run, not the first.

To try to maximize cache hits the first time queries are run, we wrote a proc that iterates through every index of every table within the entire DB, running this:

SELECT * INTO #Cache  FROM ' + @tablename + ' WITH (INDEX (' + @indexname + '))' 

In an attempt to force a big, ugly, contrived read for as much data as possible. We have it scheduled to run every 15 minutes, and it does a great job in general.

Without debating other bottlenecks, hardware specs, query plans, or query optimization, does anybody have any better ideas about how to accomplish this same task?

UPDATE
Thanks for the suggestions. Removed the "INTO #Cache". Tested & it didn't make a difference on filling the buffer.
Added: Instead of Select *, I'm selecting ONLY the keys from the Index. This (obviously) is more to-the-point and is much faster.
Added: Read & Cache Constraint Indexes also.

Here's the current code: (hope it's useful for somebody else)

CREATE VIEW _IndexView as -- Easy way to access sysobject and sysindex data SELECT  so.name as tablename, si.name as indexname, CASE si.indid WHEN 1 THEN 1 ELSE 0 END as isClustered, CASE WHEN (si.status & 2)<>0 then 1 else 0 end as isUnique, dbo._GetIndexKeys(so.name, si.indid) as Keys,     CONVERT(bit,CASE WHEN EXISTS (SELECT * FROM sysconstraints sc WHERE object_name(sc.constid) = si.name) THEN 1 ELSE 0 END) as IsConstraintIndex FROM    sysobjects so INNER JOIN sysindexes si ON so.id = si.id WHERE   (so.xtype = 'U')--User Table AND     ((si.status & 64) = 0) --Not statistics index AND (   (si.indid = 0) AND (so.name <> si.name) --not a default clustered index         OR         (si.indid > 0)     ) AND si.indid <> 255 --is not a system index placeholder  UNION SELECT  so.name as tablename, si.name as indexname, CASE si.indid WHEN 1 THEN 1 ELSE 0 END as isClustered, CASE WHEN (si.status & 2)<>0 then 1 else 0 end as isUnique, dbo._GetIndexKeys(so.name, si.indid) as Keys, CONVERT(bit,0) as IsConstraintIndex FROM    sysobjects so INNER JOIN sysindexes si ON so.id = si.id WHERE   (so.xtype = 'V')--View AND     ((si.status & 64) = 0) --Not statistics index GO   CREATE PROCEDURE _CacheTableToSQLMemory @tablename varchar(100) AS BEGIN DECLARE @indexname varchar(100) DECLARE @xtype varchar(10) DECLARE @SQL varchar(MAX) DECLARE @keys varchar(1000)  DECLARE @cur CURSOR SET @cur = CURSOR FOR SELECT  v.IndexName, so.xtype, v.keys FROM    _IndexView v INNER JOIN sysobjects so ON so.name = v.tablename WHERE   tablename = @tablename  PRINT 'Caching Table ' + @Tablename OPEN @cur FETCH NEXT FROM @cur INTO @indexname, @xtype, @keys WHILE (@@FETCH_STATUS = 0) BEGIN         PRINT '    Index ' + @indexname         --BEGIN TRAN             IF @xtype = 'V'                 SET @SQL = 'SELECT ' + @keys + ' FROM ' + @tablename + ' WITH (noexpand, INDEX (' + @indexname + '))' --             ELSE                 SET @SQL = 'SELECT ' + @keys + ' FROM ' + @tablename + ' WITH (INDEX (' + @indexname + '))' --              EXEC(@SQL)         --ROLLBACK TRAN         FETCH NEXT FROM @cur INTO @indexname, @xtype, @keys END CLOSE @cur DEALLOCATE @cur  END GO 
like image 375
El Mark Avatar asked May 22 '14 20:05

El Mark


People also ask

Does restarting SQL Server clear cache?

Yes it does. Since SQL Server is in charge of maintaining those caches, when you shut down SQL server's services, it releases the memory back to the operating system. Thank you!

How do I flush a query plan in SQL Server?

We can use the DBCC FREEPROCCACHE command to clear the procedural cache in SQL Server. We might drop a single execution plan or all plans from the buffer cache. SQL Server needs to create new execution plans once the user reruns the query.

How do you clear the buffer pool in SQL Server?

By cleaning the buffer pool before each test run SQL Server will have to re-read the data it needs from disk. To clean the buffer pool you execute the command: DBCC DROPCLEANBUFFERS. Next you should remove your execution plans from the procedure cache.


2 Answers

First of all, there is a setting called "Minumum Server Memory" that looks tempting. Ignore it. From MSDN:

The amount of memory acquired by the Database Engine is entirely dependent on the workload placed on the instance. A SQL Server instance that is not processing many requests may never reach min server memory.

This tells us that setting a larger minimum memory won't force or encourage any pre-caching. You may have other reasons to set this, but pre-filling the buffer pool isn't one of them.

So what can you do to pre-load data? It's easy. Just set up an agent job to do a select * from every table. You can schedule it to "Start automatically when Sql Agent Starts". In other words, what you're already doing is pretty close to the standard way to handle this.

However, I do need to suggest three changes:

  1. Don't try to use a temporary table. Just select from the table. You don't need to do anything with the results to get Sql Server to load your buffer pool: all you need to do is the select. A temporary table could force sql server to copy the data from the buffer pool after loading... you'd end up (briefly) storing things twice.
  2. Don't run this every 15 minutes. Just run it once at startup, and then leave it alone. Once allocated, it takes a lot to get Sql Server to release memory. It's just not needed to re-run this over and over.
  3. Don't try to hint an index. Hints are just that: hints. Sql Server is free to ignore those hints, and it will do so for queries that have no clear use for the index. The best way to make sure the index is pre-loaded is to construct a query that obviously uses that index. One specific suggestion here is to order the results in the same order as the index. This will often help Sql Server use that index, because then it can "walk the index" to produce the results.
like image 104
Joel Coehoorn Avatar answered Oct 11 '22 23:10

Joel Coehoorn


This is not an answer, but to supplement Joel Coehoorn's answer, you can look at the table data in the cache using this statement. Use this to determine whether all the pages are staying in the cache as you'd expect:

USE DBMaint GO SELECT COUNT(1) AS cached_pages_count, SUM(s.used_page_count)/COUNT(1) AS total_page_count, name AS BaseTableName, IndexName, IndexTypeDesc FROM sys.dm_os_buffer_descriptors AS bd INNER JOIN ( SELECT s_obj.name, s_obj.index_id, s_obj.allocation_unit_id, s_obj.OBJECT_ID, i.name IndexName, i.type_desc IndexTypeDesc FROM ( SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id ,allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.hobt_id AND (au.type = 1 OR au.type = 3) UNION ALL SELECT OBJECT_NAME(OBJECT_ID) AS name, index_id, allocation_unit_id, OBJECT_ID FROM sys.allocation_units AS au INNER JOIN sys.partitions AS p ON au.container_id = p.partition_id AND au.type = 2 ) AS s_obj LEFT JOIN sys.indexes i ON i.index_id = s_obj.index_id AND i.OBJECT_ID = s_obj.OBJECT_ID ) AS obj ON bd.allocation_unit_id = obj.allocation_unit_id INNER JOIN sys.dm_db_partition_stats s ON s.index_id = obj.index_id AND s.object_id = obj.object_ID WHERE database_id = DB_ID() GROUP BY name, obj.index_id, IndexName, IndexTypeDesc ORDER BY obj.name; GO 
like image 41
Dave.Gugg Avatar answered Oct 11 '22 21:10

Dave.Gugg