
Running Updates on a large, heavily used table

Tags:

sql

I have a large table (~170 million rows, 2 nvarchar and 7 int columns) in SQL Server 2005 that is constantly being inserted into. From a performance perspective everything works fine, but every once in a while I have to update a set of rows in the table, which causes problems. Updating a small set of data works fine, but updating a set of around 40,000 records takes about 3 minutes and blocks the table, which causes the inserts to start failing.

If I just run a SELECT to get back the data that needs to be updated, the 40k records come back in about 2 seconds; it's only the update that takes forever. This is reflected in the execution plan for the update, where the clustered index update takes up 90% of the cost and the index seek and Top operator that find the rows take up the remaining 10%. The column I'm updating is not part of any index key, so it shouldn't be reorganizing anything.

Does anyone have any ideas on how this could be sped up? My current thought is to write a service that detects when these updates have to happen, pulls back the records that need updating, and then loops through and updates them one by one. That would satisfy my business needs, but it's another module to maintain, and I would love to fix this purely from the DBA side of things.

Thanks for any thoughts!

asked Jan 04 '12 by kylememe


2 Answers

Actually, it might reorganise pages if you update the nvarchar columns. Depending on what the update does to these columns, it may cause a record to grow bigger than the space reserved for it before the update. (See an explanation of how nvarchar is stored at http://www.databasejournal.com/features/mssql/physical-database-design-consideration.html.)

So say a record has a string of 20 characters saved in the nvarchar column: it takes 20*2 + 2 bytes of space (the extra 2 bytes are for the offset pointer). That space is written at the initial insert into your table (based on the index structure); SQL Server uses only as much space as the nvarchar value actually takes.

Now an update comes along and writes a string of 40 characters. And oops, the space for the record within the leaf level of your index is suddenly too small. So off goes the record to a different physical place, with a pointer left in the old place pointing to the record's new location.
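You can see the variable-length storage behaviour directly with DATALENGTH. This is only a toy illustration of the 20-to-40-character example above, not the asker's schema:

    -- Toy illustration of nvarchar's variable-length storage:
    DECLARE @s nvarchar(100);

    SET @s = REPLICATE(N'a', 20);
    SELECT DATALENGTH(@s) AS bytes_used;   -- 40 bytes (20 chars * 2)

    SET @s = REPLICATE(N'a', 40);
    SELECT DATALENGTH(@s) AS bytes_used;   -- 80 bytes: on disk the row must
                                           -- grow in place or be moved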

This fragments your index, and because the whole physical structure requires changing, you see a lot of index work going on behind the scenes, very likely causing escalation to an exclusive table lock.
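One way to confirm this theory is to measure fragmentation after one of the large updates with sys.dm_db_index_physical_stats, which is available from SQL Server 2005 onwards. The table name here is a placeholder:

    -- dbo.BigTable is a hypothetical name; adjust to the real table.
    -- DETAILED mode reads every page, so run this off-hours on a 170M-row table.
    SELECT  i.name AS index_name,
            ps.index_type_desc,
            ps.avg_fragmentation_in_percent,
            ps.forwarded_record_count       -- populated for heaps only
    FROM    sys.dm_db_index_physical_stats(
                DB_ID(), OBJECT_ID(N'dbo.BigTable'), NULL, NULL, 'DETAILED') AS ps
    JOIN    sys.indexes AS i
            ON  i.object_id = ps.object_id
            AND i.index_id  = ps.index_id;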

Not sure how best to deal with this. Personally, if possible, I take an exclusive table lock, drop the index, do the updates, and reindex. Because your updates sometimes fragment the index, this might be the fastest option. However, it requires a maintenance window.
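A minimal sketch of that maintenance-window approach. Every object name here (dbo.BigTable, IX_BigTable_Notes, Notes, BatchId) is a hypothetical stand-in; adapt it to the real schema before trying it:

    -- Sketch only: all names are assumptions, run inside a maintenance window.
    BEGIN TRANSACTION;

    -- Take the table exclusively so inserts queue up instead of interleaving.
    SELECT TOP (1) 1 FROM dbo.BigTable WITH (TABLOCKX, HOLDLOCK);

    -- Drop the index that the update would otherwise have to maintain.
    DROP INDEX IX_BigTable_Notes ON dbo.BigTable;

    -- Run the big update in one pass, free of index maintenance overhead.
    UPDATE dbo.BigTable
    SET    Notes = N'reprocessed'
    WHERE  BatchId = 42;

    -- Rebuild the index before letting traffic back in.
    CREATE NONCLUSTERED INDEX IX_BigTable_Notes ON dbo.BigTable (Notes);

    COMMIT TRANSACTION;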

answered by KMB


You should batch up your update into several smaller updates (say, 10,000 rows at a time, but test!) rather than one large update of 40k rows.

This way you will avoid a table lock: SQL Server takes out roughly 5,000 locks (page or row) before escalating to a table lock, and even that threshold is not entirely predictable (memory pressure plays a part). Smaller updates made in this fashion will at least avoid the concurrency issues you are experiencing.

You can batch the updates using a service, a firehose (fast forward-only) cursor, or a plain UPDATE TOP loop, as sketched below.
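For example, a simple UPDATE TOP loop needs no external service. This is a sketch: the table name, the Status column, and its values are assumptions, and the batch size should be tested against your workload since lock escalation depends on lock granularity:

    -- Sketch only: dbo.BigTable, Status, and the batch size are assumptions.
    DECLARE @BatchSize int, @Rows int;
    SET @BatchSize = 10000;    -- per the suggestion above; TEST this value
    SET @Rows = @BatchSize;

    WHILE @Rows = @BatchSize
    BEGIN
        -- The WHERE clause must exclude rows already updated,
        -- or the loop will never terminate.
        UPDATE TOP (@BatchSize) dbo.BigTable
        SET    Status = 2
        WHERE  Status = 1;

        SET @Rows = @@ROWCOUNT;   -- a partial batch means we are done
    END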

Read this for more info on lock escalation: http://msdn.microsoft.com/en-us/library/ms184286.aspx

Hope this helps

Robert

answered by Robert Murphy