I am working on a client's database and there is about 1 million rows that need to be deleted due to a bug in the software. Is there an efficient way to delete them besides:
DELETE FROM table_1 where condition1 = 'value' ?
If you need to remove 10 million rows and have 1 GB of log space available use Delete TOP(10000) From dbo. myTable (with your select clause) and keep running it till there are no more rows to delete.
Millions of rows is not a problem, this is what SQL databases are designed to handle, if you have a well designed schema and good indexes.
If i say without loop, i can use GOTO statement for delete large amount of records using sql server. exa. like this way you can delete large amount of data with smaller size of delete.
Here is a structure for a batched delete as suggested above. Do not try 1M at once...
The size of the batch and the waitfor delay are obviously quite variable, and would depend on your servers capabilities, as well as your need to mitigate contention. You may need to manually delete some rows, measuring how long they take, and adjust your batch size to something your server can handle. As mentioned above, anything over 5000 can cause locking (which I was not aware of).
This would be best done after hours... but 1M rows is really not a lot for SQL to handle. If you watch your messages in SSMS, it may take a while for the print output to show, but it will after several batches, just be aware it won't update in real-time.
Edit: Added a stop time @MAXRUNTIME
& @BSTOPATMAXTIME
. If you set @BSTOPATMAXTIME
to 1, the script will stop on it's own at the desired time, say 8:00AM. This way you can schedule it nightly to start at say midnight, and it will stop before production at 8AM.
Edit: Answer is pretty popular, so I have added the RAISERROR
in lieu of PRINT
per comments.
DECLARE @BATCHSIZE INT, @WAITFORVAL VARCHAR(8), @ITERATION INT, @TOTALROWS INT, @MAXRUNTIME VARCHAR(8), @BSTOPATMAXTIME BIT, @MSG VARCHAR(500)
SET DEADLOCK_PRIORITY LOW;
SET @BATCHSIZE = 4000
SET @WAITFORVAL = '00:00:10'
SET @MAXRUNTIME = '08:00:00' -- 8AM
SET @BSTOPATMAXTIME = 1 -- ENFORCE 8AM STOP TIME
SET @ITERATION = 0 -- LEAVE THIS
SET @TOTALROWS = 0 -- LEAVE THIS
WHILE @BATCHSIZE>0
BEGIN
-- IF @BSTOPATMAXTIME = 1, THEN WE'LL STOP THE WHOLE JOB AT A SET TIME...
IF CONVERT(VARCHAR(8),GETDATE(),108) >= @MAXRUNTIME AND @BSTOPATMAXTIME=1
BEGIN
RETURN
END
DELETE TOP(@BATCHSIZE)
FROM SOMETABLE
WHERE 1=2
SET @BATCHSIZE=@@ROWCOUNT
SET @ITERATION=@ITERATION+1
SET @TOTALROWS=@TOTALROWS+@BATCHSIZE
SET @MSG = 'Iteration: ' + CAST(@ITERATION AS VARCHAR) + ' Total deletes:' + CAST(@TOTALROWS AS VARCHAR)
RAISERROR (@MSG, 0, 1) WITH NOWAIT
WAITFOR DELAY @WAITFORVAL
END
BEGIN TRANSACTION
DoAgain:
DELETE TOP (1000)
FROM <YourTable>
IF @@ROWCOUNT > 0
GOTO DoAgain
COMMIT TRANSACTION
Maybe this solution from Uri Dimant
WHILE 1 = 1
BEGIN
DELETE TOP(2000)
FROM Foo
WHERE <predicate>;
IF @@ROWCOUNT < 2000 BREAK;
END
(Link: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/b5225ca7-f16a-4b80-b64f-3576c6aa4d1f/how-to-quickly-delete-millions-of-rows?forum=transactsql)
Here is something I have used:
If the bad data is mixed in with the good-
INSERT INTO #table
SELECT columns
FROM old_table
WHERE statement to exclude bad rows
TRUNCATE old_table
INSERT INTO old_table
SELECT columns FROM #table
Not sure how good this would be but what if you do like below (provided table_1
is a stand alone table; I mean no referenced by other table)
create a duplicate table of table_1
like table_1_dup
insert into table_1_dup select * from table_1 where condition1 <> 'value';
drop table table_1
sp_rename table_1_dup table_1
If you cannot afford to get the database out of production while repairing, do it in small batches. See also: How to efficiently delete rows while NOT using Truncate Table in a 500,000+ rows table
If you are in a hurry and need the fastest way possible:
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With