I have a routine in our .NET web application that allows a user on our platform to clear their account (i.e. delete all their data). This routine runs in a stored procedure and essentially loops through the relevant data tables and clears down all the various items they have created.
The stored procedure looks something like this.
ALTER procedure [dbo].[spDeleteAccountData](
@accountNumber varchar(30) )
AS
BEGIN
SET ANSI_NULLS ON ;
SET NOCOUNT ON;
BEGIN TRAN
BEGIN TRY
DELETE FROM myDataTable1 WHERE accountNumber = @accountNumber
DELETE FROM myDataTable2 WHERE accountNumber = @accountNumber
DELETE FROM myDataTable3 WHERE accountNumber = @accountNumber
//Etc.........
END TRY
BEGIN CATCH
//CATCH ERROR
END CATCH
IF @@TRANCOUNT > 0
COMMIT TRANSACTION;
SET ANSI_NULLS OFF;
SET NOCOUNT OFF;
END
The problem is that in some cases we can have over 10,000 rows on a table and the procedure can take up to 3-5 minutes. During this period all the other connections on the database get throttled causing time-out errors like the one below:
System.Data.SqlClient.SqlException (0x80131904): Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Are there any general changes I can make to improve performance? I appreciate there are many unknowns related to the design of our database schema, but general best practice advice would be welcomed! I thought about scheduling this task to run during the early hours to minimise impact, but this is far from Ideal as the user wouldn't be able to regain access to their account until this task had been completed.
Additional Information:
Edit: 16:52 GMT
The delete proc affects around 20 tables. The largest one has approx 5 million records. The others have no more the 200,000, with some containing only 1000-2000 records.
Do you have an index on accountNumber
in all tables ?
Seeing that you delete using a WHERE
clause by that column, this might help.
Another option (and probably even better solution) would be to schedule deletion operations at night, e.g. when user selects to delete his account, you're only setting a flag, and a delete job runs at night actually deleting those accounts flagged for deletion.
If you have an index on the accountNumber field then I guess the long time for deletion is due to locks (generated by other processes) or to foreign keys affected by the respective tables.
Off course purists will blame me for the latter but I had been using this a lot of times when need arises.
One way you might want to try is this:
Example:
DECLARE @DeletedRowsCount INT = 1, @BatchSize INT = 300;
WHILE (@DeletedRowsCount> 0) BEGIN
BEGIN TRANSACTION
DELETE TOP (@BatchSize) dbo.Table
FROM dbo.Table
WHERE Id = @PortalId;
SET @DeletedRowsCount = @@ROWCOUNT;
COMMIT;
WAITFOR DELAY '00:00:05';
END
I guess you can do the same without a SP as well. In fact, it might be better like that.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With