
INSERT INTO gets much slower over time in SQL Server 2012

We have a very large database, WriteDB, which stores raw trading data; we use its table for fast writes. I then use SQL scripts to import the data from WriteDB into ReadDB, into a nearly identical table that is extended with some extra values and an added relation. The import script looks like this:

TRUNCATE TABLE [ReadDB].[dbo].[Price]
GO
INSERT INTO [ReadDB].[dbo].[Price]
SELECT a.*, 0 as ValueUSD, 0 as ValueEUR
from [WriteDB].[dbo].[Price] a
JOIN [ReadDB].[dbo].[Companies] b ON a.QuoteId = b.QuoteID

Initially there were around 130 million rows in this table (~50 GB). Each day some rows are added and some change, so for now we decided not to overcomplicate the logic and simply re-import all the data. The problem is that, for some reason, this script takes longer and longer over time on almost the same amount of data. The first run took ~1 h; now it takes 3 h.

SQL Server also behaves poorly after the import. After the import (or during it), other queries, even the simplest ones, often fail with timeout errors.

What is the reason for this behavior, and how can I fix it?

Ph0en1x asked Apr 21 '15 12:04




1 Answer

One theory is that your first 50GB dataset has filled the memory available for caching. Truncating the table then invalidates the cached ReadDb pages, leaving a large part of the cache effectively empty. This alternating behavior makes effective use of the cache difficult and incurs a substantial number of cache misses / increased IO time.

Consider the following sequence of events:

  1. You load your initial dataset into WriteDb. During the load operation, pages in WriteDb are cached. There's very little memory contention because there's only one copy of the dataset and sufficient memory.
  2. You initially populate ReadDb. The pages required to populate ReadDb (the data in WriteDb) are already largely cached. Fewer reads are required from disk, and your IO time can be dedicated to writing the inserted data for ReadDb. (This is your fast first run.)
  3. You load your second dataset into WriteDb. During the load operation, there is insufficient memory to cache both existing data in ReadDb and new data written to WriteDb. This memory contention leads to fewer pages of WriteDb cached.
  4. You truncate ReadDb. This invalidates a substantial portion of your cache (i.e. the 50GB of ReadDb data that was cached).
  5. You then attempt your second load of ReadDb. Here you have very little of WriteDb cached, so your IO time is split between reading pages of WriteDb (your query) and writing pages of ReadDb (your insert). (This is your slow second run.)
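One way to observe the memory contention described in these steps is to check how much of each database currently sits in the buffer pool. The query below is a minimal sketch, assuming the databases are named ReadDB and WriteDB as in the question and that the login has VIEW SERVER STATE permission. On a server with a large buffer pool, the query itself can take a while to run.

```sql
-- Approximate size of the cached pages per database (each page is 8 KB).
-- Run it before and after each load to see which database dominates the cache.
SELECT
    DB_NAME(database_id)      AS database_name,
    COUNT_BIG(*) * 8 / 1024   AS cached_mb
FROM sys.dm_os_buffer_descriptors
WHERE DB_NAME(database_id) IN ('ReadDB', 'WriteDB')
GROUP BY database_id
ORDER BY cached_mb DESC;
```

If the theory holds, WriteDB should show noticeably fewer cached megabytes just before the slow runs than it did before the fast first run.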

You could test this theory by comparing the SQL Server cache miss ratio during your first and second load operations.
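A rough sketch of such a check reads the buffer manager counters that SQL Server exposes through `sys.dm_os_performance_counters`. The hit ratio is stored as a value and a base, so the two have to be divided. Page life expectancy is included here as an extra signal of cache churn; treating it as relevant is an assumption, not something the theory above depends on.

```sql
-- Buffer cache hit ratio (percent) and page life expectancy (seconds).
-- Sample this during the fast and the slow load and compare the values.
SELECT
    ratio.cntr_value * 100.0 / NULLIF(base.cntr_value, 0) AS buffer_cache_hit_pct,
    ple.cntr_value                                        AS page_life_expectancy_sec
FROM sys.dm_os_performance_counters AS ratio
JOIN sys.dm_os_performance_counters AS base
    ON  base.object_name  = ratio.object_name
    AND base.counter_name = 'Buffer cache hit ratio base'
JOIN sys.dm_os_performance_counters AS ple
    ON  ple.object_name  = ratio.object_name
    AND ple.counter_name = 'Page life expectancy'
WHERE ratio.object_name LIKE '%Buffer Manager%'
  AND ratio.counter_name = 'Buffer cache hit ratio';
```

A lower hit percentage or a sharply dropping page life expectancy during the slow run would be consistent with the cache-miss explanation, though it would not rule out other causes.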

Some ways to improve performance might be to:

  • Use separate disk arrays for ReadDb / WriteDb to increase parallel IO performance.
  • Increase the available cache (amount of server memory) to accommodate the combined size of ReadDb + WriteDb and minimize cache misses.
  • Minimize the impact of each load operation on existing cached pages by using a MERGE statement instead of dumping / loading 50GB of data at a time.
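The MERGE suggestion could look roughly like the sketch below. The question does not show the table's columns, so `PriceId`, `QuoteId`, and `Value` are hypothetical names chosen for illustration; only `ValueUSD` and `ValueEUR` come from the original import script. It also assumes `PriceId` uniquely identifies a row in both tables.

```sql
-- Hypothetical columns: PriceId (key), QuoteId, Value.
-- Only rows that are new, changed, or gone are written to ReadDB.
MERGE [ReadDB].[dbo].[Price] AS target
USING (
    SELECT a.PriceId, a.QuoteId, a.Value
    FROM [WriteDB].[dbo].[Price] AS a
    JOIN [ReadDB].[dbo].[Companies] AS b ON a.QuoteId = b.QuoteID
) AS source
    ON target.PriceId = source.PriceId
WHEN MATCHED AND (target.Value <> source.Value
               OR target.QuoteId <> source.QuoteId) THEN
    -- Row exists but has changed: update it in place.
    UPDATE SET target.Value   = source.Value,
               target.QuoteId = source.QuoteId
WHEN NOT MATCHED BY TARGET THEN
    -- New row in WriteDB: insert it with the default extra values.
    INSERT (PriceId, QuoteId, Value, ValueUSD, ValueEUR)
    VALUES (source.PriceId, source.QuoteId, source.Value, 0, 0)
WHEN NOT MATCHED BY SOURCE THEN
    -- Row no longer present in the source: remove it.
    DELETE;
```

The `<>` comparisons do not detect changes to or from NULL, so nullable columns would need explicit NULL handling. The statement still has to read the source table, so the expected gain is in writes and in cached ReadDB pages that stay valid, not in eliminating the scan of WriteDB.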
Michael Petito answered Sep 29 '22 10:09
