I have a large SQL Server database with a table at about 45 million records. I am archiving this table, and need to remove all entries greater than two years ago. I have the inserting into my archive table working fine, but I'm having issues with efficiency when deleting.
My problem lies within the indexes currently on the table. I would like to delete (and archival insert) in 1000 record chunks. To do this, I need to determine the "top" 1000 records fulfilling the requirement (greater than two years old). The DateTime stamp on the row is a clustered index, so this is great for grabbing the rows. However SQL 2000 does not allow DELETE TOP 1000.... so I need to do something like:
DELETE FROM <table> WHERE [UniqueID] IN
(SELECT TOP 1000 [UniqueID] FROM <table> WHERE [DateTime] < @TwoYearsAgo)
This would work great, if UniqueID was indexed. Since it is not, this takes a very long time (it is scanning the table for each of the 1000 records to be deleted). There are no other indexes on the table that uniquely identify the records. I am told it would be too costly to compute an index on UniqueID, as this is a live DB. Can anyone point out a way to optimize this query?
The DateTime stamp on the row is a clustered index, so this is great for grabbing the rows. However SQL 2000 does not allow DELETE TOP 1000.... so I need to do something like: This would work great, if UniqueID was indexed. Since it is not, this takes a very long time (it is scanning the table for each of the 1000 records to be deleted).
The syntax for the DELETE TOP statement in SQL Server (Transact-SQL) is: DELETE TOP (top_value) [ PERCENT ] FROM table [WHERE conditions]; Parameters or Arguments table The table that you wish to delete records from. WHERE conditions Optional. The conditions that must be met for the records to be deleted. TOP (top_value)
The SQL Server (Transact-SQL) DELETE TOP statement is used to delete records from a table in SQL Server and limit the number of records deleted based on a fixed value or percentage. The syntax for the DELETE TOP statement in SQL Server (Transact-SQL) is: The table that you wish to delete records from. Optional.
If it was 100,000,000 rows then your points might be valid, but for just 1000 rows, this is by far the simplest solution proposed so far for SQL 2008. It is fast. Try it: DELETE FROM YourTABLE FROM (SELECT TOP XX PK FROM YourTABLE) tbl WHERE YourTABLE.PK = tbl.PK
How about rewriting the query?
SET ROWCOUNT 1000
DELETE FROM <table> WHERE [DateTime] < @TwoYearsAgo
See documentation on SET ROWCOUNT (Transact-SQL).
Also note that per the documentation for DELETE, it supports the TOP
clause, but that is apparently new for SQL Server 2005 and up. I'm saying this since it sounds like it isn't supported on your database server, but have you actually tried using it? I don't have access to SQL Server 2000 documentation so I'm unsure if it is supported on that version. It very well might not be.
DELETE TOP (1000) FROM <table> WHERE [DateTime] < @TwoYearsAgo
Note the difference from the way TOP on select can be written, without the parenthesis. For UPDATE, DELETE and INSERT, the expression must be parenthesized, even if it's only a constant number like above.
You can delete a subquery:
DELETE <table> FROM (
SELECT TOP 1000 *
FROM <table>
WHERE [DateTime] < @TwoYearsAgo);
See the example E: at SQL 2000 DELETE Syntax. This is recommended over the SET ROWCOUNT approach. In SQL 2005 and later you can specify directly the TOP in DELETE.
you can also do
DELETE TOP(1000) FROM <table> WHERE [DateTime] < @TwoYearsAgo
God only knows why they use top(x) for delete and top x for select, most people don't even seem to know about this feature!
edit: Apparently its 2005+ so you should probably ignore this.
You could use SET ROWCOUNT:
SET ROWCOUNT 1000
DELETE FROM <table> WHERE [DateTime] < @TwoYearsAgo
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With