 

Best practice for a SQL Archiving Stored Procedure

I have a very large database (~100 GB) that primarily consists of two tables I want to reduce in size, each with roughly 50 million records. I have an archive DB set up on the same server with these two tables, using the same schema. I'm trying to determine the best conceptual approach for removing the rows from the live DB and inserting them into the archive DB. In pseudocode, this is what I'm doing now:

Declare @NextIDs Table(UniqueID)
Declare @twoYearsAgo = two years before today's date

Insert into @NextIDs
     SELECT top 100 UniqueID from myLargeTable Where myLargeTable.actionDate < @twoYearsAgo

Insert into myArchiveTable
(<fields>)
SELECT <fields>
FROM myLargeTable INNER JOIN @NextIDs on myLargeTable.UniqueID = @NextIDs.UniqueID

DELETE myLargeTable
FROM myLargeTable INNER JOIN @NextIDs on myLargeTable.UniqueID = @NextIDs.UniqueID

Right now this takes a horrifically slow 7 minutes to complete 1000 records. I've tested the DELETE and the INSERT separately; each takes approx. 3.5 minutes to complete, so it's not that one is drastically more inefficient than the other. Can anyone point out some optimization ideas here?

Thanks!

This is SQL Server 2000.

Edit: The large table has a clustered index on the ActionDateTime field. There are two other indexes, but neither is referenced in any of the queries. The archive table has no indexes. On my test server this is the only query hitting the SQL Server, so it should have plenty of processing power.
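
For reference, the index setup can be confirmed with sp_helpindex (a minimal check, assuming the database and table names used in the code below):

USE ISAdminDB
-- list the indexes currently defined on the live table
EXEC sp_helpindex 'dbo.UserUnitAudit'

USE ISArchive
-- list the indexes on the archive table (expected to report none, per the setup described above)
EXEC sp_helpindex 'dbo.userunitaudit'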

Code (this loops in batches of 1000 records at a time):

DECLARE @NextIDs TABLE(UniqueID int PRIMARY KEY)
DECLARE @TwoYearsAgo datetime
SELECT @TwoYearsAgo = DATEADD(d, (-2 * 365), GETDATE())

WHILE EXISTS (SELECT TOP 1 UserName
              FROM [ISAdminDB].[dbo].[UserUnitAudit]
              WHERE [ActionDateTime] < @TwoYearsAgo)
BEGIN

    BEGIN TRAN

    --get the next batch of records to be archived
    INSERT INTO @NextIDs(UniqueID)
    SELECT TOP 1000 UniqueID
    FROM [ISAdminDB].[dbo].[UserUnitAudit]
    WHERE [UserUnitAudit].[ActionDateTime] < @TwoYearsAgo

    --insert into archive table
    INSERT INTO [ISArchive].[dbo].[userunitaudit] (<Fields>)
    SELECT <Fields>
    FROM [ISAdminDB].[dbo].[UserUnitAudit] AS a
    INNER JOIN @NextIDs AS b ON a.UniqueID = b.UniqueID

    --remove from Admin DB
    DELETE a
    FROM [ISAdminDB].[dbo].[UserUnitAudit] AS a
    INNER JOIN @NextIDs AS b ON a.UniqueID = b.UniqueID

    --clear the batch of IDs before the next iteration
    DELETE FROM @NextIDs

    COMMIT

END
asked by Kevin


1 Answer

You effectively have three SELECTs that need to run before your INSERT/DELETE commands are executed:

for the 1st insert:

SELECT top 100 UniqueID from myLargeTable Where myLargeTable.actionDate < @twoYearsAgo

for the 2nd insert:

SELECT <fields> FROM myLargeTable INNER JOIN @NextIDs
on myLargeTable.UniqueID = @NextIDs.UniqueID

for the delete:

(select *)
FROM myLargeTable INNER JOIN @NextIDs on myLargeTable.UniqueID = @NextIDs.UniqueID
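
You can check the plan each of these gets without running anything. A rough sketch for the first one, reusing the names from your posted batch and GETDATE() in place of the @TwoYearsAgo variable:

SET SHOWPLAN_TEXT ON
GO

-- shows the estimated plan only; nothing is executed
-- repeat with the joined INSERT/DELETE SELECTs to see which indexes they use
SELECT TOP 1000 UniqueID
FROM [ISAdminDB].[dbo].[UserUnitAudit]
WHERE ActionDateTime < GETDATE()
GO

SET SHOWPLAN_TEXT OFF
GO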

I'd try to optimize these first, and if they are all quick, then the indexes may be slowing down your writes. Some suggestions:

  1. start Profiler and see what's happening with the reads/writes etc.

  2. check index usage for all three statements.

  3. try running the SELECTs returning only the PK, to see whether the delay is in query execution or in fetching the data (do you have any full-text-indexed fields, TEXT fields etc.?); see the sketch after this list.
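
For point 3, something along these lines (a sketch only, reusing your table and column names, with the date computed inline instead of the @TwoYearsAgo variable) shows where the time goes:

SET STATISTICS IO ON
SET STATISTICS TIME ON

-- keys only: roughly the cost of locating the rows
SELECT TOP 1000 UniqueID
FROM [ISAdminDB].[dbo].[UserUnitAudit]
WHERE ActionDateTime < DATEADD(d, -2 * 365, GETDATE())

-- all columns: the difference against the query above is the cost of
-- actually fetching the data (wide rows, TEXT columns etc.)
SELECT TOP 1000 *
FROM [ISAdminDB].[dbo].[UserUnitAudit]
WHERE ActionDateTime < DATEADD(d, -2 * 365, GETDATE())

SET STATISTICS TIME OFF
SET STATISTICS IO OFF

If the keys-only version is fast and the full-row version is not, the delay is in fetching the data rather than in finding the rows.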

answered by davek