Should we use a flag for soft deletes, or a separate joiner table? Which is more efficient? Database is SQL Server.
Background Information
A while back we had a DB consultant come in and look at our database schema. When we soft delete a record, we update an IsDeleted flag on the appropriate table(s). The consultant suggested that, instead of using a flag, we store the deleted records in a separate table and use a join, as that would be better. I've put that suggestion to the test, but at least on the surface, the extra table and join look to be more expensive than using a flag.
Initial Testing
I've set up this test.
Two tables, Example and DeletedExample. I added a nonclustered index on the IsDeleted column.
I ran three tests, each loading a million records, with the following deleted/non-deleted ratios: 50/50, 10/90, and 1/99.
Results - 50/50
Results - 10/90
Results - 1/99
Database Scripts
For reference: Example, DeletedExample, and the index on Example.IsDeleted.
CREATE TABLE [dbo].[Example](
    [ID] [int] NOT NULL,
    [Column1] [nvarchar](50) NULL,
    [IsDeleted] [bit] NOT NULL,
    CONSTRAINT [PK_Example] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Example] ADD CONSTRAINT [DF_Example_IsDeleted] DEFAULT ((0)) FOR [IsDeleted]
GO
CREATE TABLE [dbo].[DeletedExample](
    [ID] [int] NOT NULL,
    CONSTRAINT [PK_DeletedExample] PRIMARY KEY CLUSTERED
    (
        [ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[DeletedExample] WITH CHECK ADD CONSTRAINT [FK_DeletedExample_Example] FOREIGN KEY([ID])
    REFERENCES [dbo].[Example] ([ID])
GO
ALTER TABLE [dbo].[DeletedExample] CHECK CONSTRAINT [FK_DeletedExample_Example]
GO
CREATE NONCLUSTERED INDEX [IX_IsDeleted] ON [dbo].[Example]
(
    [IsDeleted] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
GO
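For anyone who wants to reproduce the comparison, here is a rough sketch of how the data could be loaded and how the two query shapes might look. The exact load script and queries used in the test are not shown in the question, so everything below is an assumption: the @DeletedPercent variable is a made-up knob (50, 10, or 1 would match the ratios above), the row generator is just one common trick, and inline DECLARE initialization needs SQL Server 2008 or later.

```sql
-- Assumption: run against empty Example / DeletedExample tables.
-- @DeletedPercent is a hypothetical setting: 50, 10, or 1.
DECLARE @DeletedPercent int = 10;

-- Generate 1,000,000 sequential numbers by cross joining a system view
-- with itself, then mark the requested share of each 100 rows as deleted.
WITH Numbers AS (
    SELECT TOP (1000000)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO dbo.Example (ID, Column1, IsDeleted)
SELECT n,
       N'Row ' + CAST(n AS nvarchar(20)),
       CASE WHEN n % 100 < @DeletedPercent THEN 1 ELSE 0 END
FROM Numbers;

-- Mirror the deleted rows into the separate table.
INSERT INTO dbo.DeletedExample (ID)
SELECT ID
FROM dbo.Example
WHERE IsDeleted = 1;

-- Report logical reads and CPU/elapsed time for the queries below.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Variant A: the flag.
SELECT e.ID, e.Column1
FROM dbo.Example AS e
WHERE e.IsDeleted = 0;

-- Variant B: the separate table, as an anti-join.
SELECT e.ID, e.Column1
FROM dbo.Example AS e
LEFT JOIN dbo.DeletedExample AS d ON d.ID = e.ID
WHERE d.ID IS NULL;
```

Comparing the logical reads and the actual execution plans of the two SELECTs at each ratio is probably more telling than wall-clock time alone.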
The numbers you have seem to indicate that my initial impression was correct: if your most common query against this database is to filter on IsDeleted = 0, then performance will be better with a simple bit flag, especially if you make wise use of indexes.
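As one possible reading of "wise use of indexes": an index on a bit column alone is often not selective enough for the optimizer to bother with, so a filtered index that contains only the live rows may serve the IsDeleted = 0 case better. This is only a sketch. The index name IX_Example_Active and the choice of included column are my own assumptions, and filtered indexes require SQL Server 2008 or later.

```sql
-- Hypothetical index name; covers queries that read ID and Column1
-- for rows that have not been soft deleted.
CREATE NONCLUSTERED INDEX IX_Example_Active
ON dbo.Example (ID)
INCLUDE (Column1)
WHERE IsDeleted = 0;

-- A query whose predicate matches the index filter can be answered
-- from the filtered index alone.
SELECT ID, Column1
FROM dbo.Example
WHERE IsDeleted = 0;
```

Whether this actually helps depends on the ratio of deleted rows and on which columns your real queries touch, so check the execution plan before keeping it.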
If you often query for deleted and undeleted data separately, then you could see a performance gain by having a table for deleted items and another for undeleted items, with identical fields. But denormalizing your data like this is rarely a good idea, as it will most often cost you far more in code maintenance than it gains you in performance.
I'm no SQL expert, but in my opinion it all depends on how heavily the database is used. If the database is accessed by a large number of users and needs to be efficient, then a separate isDeleted table is a good choice. A better option would be to use a flag during production and, as part of daily/weekly/monthly maintenance, move all the soft-deleted records to the isDeleted table and clear them out of the production table. A mixture of both options would be a good one.
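A maintenance job along those lines could be sketched roughly as follows. The archive table dbo.ExampleArchive, its ArchivedAt column, and the batch size are all hypothetical names and values I made up for illustration. The sketch also assumes the flag-only design, i.e. no rows in DeletedExample still reference the rows being moved, otherwise the foreign key would block the delete.

```sql
-- Hypothetical archive table. It must not have triggers, CHECK
-- constraints, or foreign keys, because it is an OUTPUT INTO target.
CREATE TABLE dbo.ExampleArchive (
    ID int NOT NULL CONSTRAINT PK_ExampleArchive PRIMARY KEY CLUSTERED,
    Column1 nvarchar(50) NULL,
    ArchivedAt datetime NOT NULL
        CONSTRAINT DF_ExampleArchive_ArchivedAt DEFAULT (GETDATE())
);

DECLARE @BatchSize int = 5000;  -- assumed value; tune for your workload
DECLARE @Moved int = 1;

-- Move soft-deleted rows in small batches to keep locks and
-- transaction log growth modest.
WHILE @Moved > 0
BEGIN
    -- Each DELETE removes a batch from the production table and writes
    -- the same rows to the archive as a single atomic statement.
    DELETE TOP (@BatchSize)
    FROM dbo.Example
    OUTPUT deleted.ID, deleted.Column1
    INTO dbo.ExampleArchive (ID, Column1)
    WHERE IsDeleted = 1;

    SET @Moved = @@ROWCOUNT;
END;
```

Scheduling this from a SQL Server Agent job during a quiet period would match the daily/weekly/monthly idea above.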