I am currently working with an application that uses an Azure-hosted SQL Server instance. The application data doesn't take up much physical space, but there are a lot of records. There are times when I need to delete a large number of records, for example, let's say 5 million. As you might guess, this takes a lot of time and resources. The issue is that I don't need a lot of resources for anything else. To avoid pegging the DTUs at 100% for 30 minutes or longer, I would need far more resources than I need under normal use. I don't care how long the delete takes, within reason. From what I have researched, I cannot find a good way to limit the usage. It would be nice if I could somehow allow only 50% usage for the operation, or something like that. Maybe I am missing something that could make the delete more efficient, but I don't think so. It's a pretty simple table with an index on the column I am using to do the delete. It seems like the main component that gets maxed out is Data IO. If anyone has any good ideas on how I can manage this, it would be appreciated.
A delete involves locating the data, reading it from disk, and logging those operations.
Locating data/Minimizing IO:
To minimize IO, make sure the right index exists, so that the delete can seek directly to the target rows instead of scanning the table.
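As a minimal sketch, assuming the delete filters on a single column, a supporting index might look like the following. The names dbo.tablename, somecol, and IX_tablename_somecol are placeholders for illustration, not objects from the question.

```sql
-- Hypothetical names: dbo.tablename and somecol stand in for the real
-- table and the column used in the DELETE's WHERE clause.
CREATE NONCLUSTERED INDEX IX_tablename_somecol
    ON dbo.tablename (somecol);
```

If an index on the filter column already exists, as in the question, this step adds nothing and can be skipped.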
Sometimes operators in the delete plan run in parallel. To avoid this, add a MAXDOP hint so that nothing in the query runs in parallel:
-- tablename, somecol and someval are placeholders
delete from tablename where somecol = someval
option (maxdop 1)
Minimizing log operation:
Every DML operation is logged, and deleting rows one at a time, each in its own transaction, costs more log IO (one of the DTU metrics in Azure SQL Database) because every commit forces a log flush. Delete in batches instead, so that each batch of rows is a single transaction:
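-- Each DELETE below commits as its own transaction of up to 1000 rows.
-- tablename, id and someval are placeholders.
while 1 = 1
begin
    delete top (1000) from tablename where id = someval
    if @@rowcount = 0
        break;
end
go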
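To address the "only allow 50% usage" part of the question, one possible approach is to pause between batches and back off while the database is busy. The sketch below is an illustration under assumptions, not a built-in throttle: dbo.tablename, id, and @someval are placeholders, and the 50% threshold, batch size, and delays are arbitrary starting points. It reads sys.dm_db_resource_stats, which in Azure SQL Database reports resource usage in 15-second samples and requires VIEW DATABASE STATE permission.

```sql
-- Hypothetical names and values; tune the threshold, batch size and delays.
DECLARE @someval int = 42;        -- value being deleted (placeholder)
DECLARE @batch   int = 1000;      -- rows per batch
DECLARE @rows    int = 1;         -- rows removed by the last batch
DECLARE @io      decimal(5,2);    -- latest data IO percentage

WHILE @rows > 0
BEGIN
    -- Most recent 15-second sample of data IO usage for this database
    SELECT TOP (1) @io = avg_data_io_percent
    FROM sys.dm_db_resource_stats
    ORDER BY end_time DESC;

    -- Back off while data IO is above the chosen ceiling
    IF @io > 50
    BEGIN
        WAITFOR DELAY '00:00:15';
        CONTINUE;
    END

    -- One small, serial batch; it commits as its own transaction
    DELETE TOP (@batch) FROM dbo.tablename
    WHERE id = @someval
    OPTION (MAXDOP 1);

    SET @rows = @@ROWCOUNT;

    -- Short pause so other workloads get a share of the IO
    WAITFOR DELAY '00:00:01';
END
```

Because the usage samples lag by up to 15 seconds, this only approximates a cap: it trades a longer total run time for lower sustained DTU usage rather than enforcing a hard limit.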
You can also partition your tables to make deletes faster. Starting with SQL Server 2016, TRUNCATE can be used on individual partitions:
TRUNCATE TABLE tablename
WITH (PARTITIONS (1,2,3))
GO
The syntax also allows you to specify a range:
[ WITH ( PARTITIONS ( { <partition_number_expression> | <range> }
[ , ...n ] ) ) ]
Partitioning helps only when you want to delete all of a partition or none of it. If you do this type of delete often, you may need to design your table so that the rows you remove together fall into the same partition and can be truncated.
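A minimal sketch of such a design follows, assuming the rows that get deleted together share an integer key. Every name here (pf_batch, ps_batch, dbo.Events, batch_id) and the boundary values are hypothetical. In Azure SQL Database all partitions have to be placed on the PRIMARY filegroup, and the table's indexes must be aligned with the partition scheme for a partition-level truncate to work.

```sql
-- Hypothetical partition function: RANGE RIGHT with boundaries 100, 200, 300
-- gives partition 1: < 100, 2: 100-199, 3: 200-299, 4: >= 300.
CREATE PARTITION FUNCTION pf_batch (int)
    AS RANGE RIGHT FOR VALUES (100, 200, 300);

-- Map every partition to the PRIMARY filegroup.
CREATE PARTITION SCHEME ps_batch
    AS PARTITION pf_batch ALL TO ([PRIMARY]);

-- Hypothetical table partitioned on batch_id; the clustered primary key
-- includes the partitioning column, so it is aligned with the scheme.
CREATE TABLE dbo.Events
(
    id       bigint        NOT NULL,
    batch_id int           NOT NULL,
    payload  nvarchar(200) NULL,
    CONSTRAINT PK_Events PRIMARY KEY CLUSTERED (batch_id, id)
) ON ps_batch (batch_id);

-- Look up which partition holds a given batch_id (150 falls in partition 2).
SELECT $PARTITION.pf_batch(150) AS partition_number;

-- Remove every row in that partition without a row-by-row delete.
TRUNCATE TABLE dbo.Events WITH (PARTITIONS (2));
```

Note that the truncate removes everything in the partition, so this only fits when the boundaries line up with the sets of rows that are deleted together.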
Further reading and References:
https://www.sqlshack.com/sql-server-2016-enhancements-truncate-table-table-partitioning/