Deleting large number of rows from SQL Server - in an efficient and non-locking manner

Tags:

I am writing a procedure to delete all rows from a few tables over n days old.

A dead simple query is easy to write

DELETE FROM [myTable] 
WHERE [Created] < GETDATE()-30

One problem is there is no index on the date field - I could add one, but I was working around it by doing something like:

SELECT @var = MAX([ID]) FROM myTable WHERE Created < GETDATE()-30; 
DELETE FROM myTable WHERE ID < @var

Does that seem like an acceptable method?

The problem is the table is huge, and this query will be deleting likely hundreds of thousands of rows every run.

Running it on a (slightly slow) test server it is taking an hour or so, and killing the table from other processes trying to read/write to it.

I don't so much mind it taking a while to run (though quicker is better) - but I can not have it locking the table for an hour while it is running, as there are constant read/writes going on (mainly writes).

My DB knowledge is quite basic, as I'm a coder not a dba.

Can someone give me a decent method for performing this task - in the most efficient way possible.

874

asked May 05 '11 21:05

jb.

2 Answers

What you're looking for is a partitioned based sliding window: How to Implement an Automatic Sliding Window in a Partitioned Table on SQL Server 2005. Partition the table by day and you can efficiently drop an entire day in a single partition switch operation at midnight. Partition switch is basically instantaneous.

If you want a solution with slightly lower overhead (partitioning has serious consequences and ripples through the entire application, specially when the indexes have to be aligned, which is a requirement for fast switch operations), then you have to design your schema in accordance with this operation. With 99.99% confidence I can say that the leftmost clustered key of your myTable must be the Created field. This would allow efficient batch deletes (delete top (2500) from myTable where Created < ...). There are many reasons why you want this to be batched (top 2500 or so at a time), most important being that you must avoid lock escalation and you must keep the size of any individual transaction within reasonable limits.

109

answered Sep 30 '22 08:09

Remus Rusanu

Your method will suffer from the same malady as a normal delete would - you don't have an index on [Created]. Therefore your method is just more convoluted.

I would suggest that you create said index and try the normal delete on your test server.

Another suggestion - run this outside of normal business hours through a scheduler.

answered Sep 30 '22 10:09

Otávio Décio

Related questions
                            
                                how to enter values in rowguid column?
                            
                                mysql group by and count rows problem
                            
                                Generic SQL builder .NET
                            
                                Counting multiple rows in MySQL in one query
                            
                                How to define a connection string to a SQL Server 2008 database?
                            
                                Automatically create scripts for all SQL Server Jobs
                            
                                Updating one column with other's value in same table
                            
                                T-SQL: How to return 0 rows in from stored procedure, and how to use XACT_ABORT and TRY/CATCH
                            
                                Best way to store tags in a database?
                            
                                Why can't I get the @@rowcount value?
                            
                                How to monitor MySQL queries on Windows
                            
                                Get the last 24 hour job record form msdb.dbo.sysjobhistory
                            
                                SQL Server Management Studio 2008 R2 Export Structure + Data
                            
                                How to give the output of the first query(which has two values) as the input to the second?
                            
                                PL/SQL - How to return single row from a joined table
                            
                                Is this the correct way to create SQL assertion?
                            
                                Running Sums for Multiple Categories in MySQL
                            
                                Return count in Stored Procedure
                            
                                How to delete an attribute from an XML variable in sql server 2008?
                            
                                How to query for a primary key in oracle 11g

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Deleting large number of rows from SQL Server - in an efficient and non-locking manner

Tags:

sql

tsql

sql-server-2005

jb.

People also ask

2 Answers

Remus Rusanu

Otávio Décio

Recent Activity

Donate For Us