Is it better, for indexing in the long run, to delete or "deactivate" rows, or under what circumstances?
I notice overhead generated after deleting a significant number of rows from a table. Why does this occur and what can be done 1) to prevent it and 2) to fix it?
If large volumes/ old or archived history are going to be deleted -- delete them outright.
In the near term, for manual user-level deletion, a "soft delete" is typically preferred. Manual deletions probably won't be more than about 10% of records, so index effectiveness will remain high.
"Soft delete" also has major benefits that admin can un-delete mistakenly deletions, and referential integrity & the referenced details of a transaction are all happily preserved!
For long-term archiving/deletion, you want to remove those records from the index -- and, apart from proprietary & database-specific "conditional indexing" which I'd tend to avoid, deleting them from the table is the only way to remove them from the index.
For SQL Server...
I think it's important to know that if you're deleting all records for a very large table (meaning lots of records), you'll want to truncate first, then drop the indexes. It's much more efficient.
If you want to remove a subset of the records, and you have indexes applied, use the DELETE FROM {table} WHERE {condition} syntax. If you do, you must delete from the dependent tables first in the order of the dependency hierarchy. Basically the exact opposite of how you insert records, starting with the non-dependent tables first.
DELETE records with table dependency hierarchy:
Dependent/Child Table (depends on the Dependency table):
DELETE FROM [table_dependent]; -- "dependent" is a relative term since this may be part of a hierarchy; a FK in this table points to the PK of the [table_independent] table; in a physical database model, this table is sometimes referred to as the child table
Dependency/Parent Table:
DELETE FROM [table_independent]; -- "independent" is a relative term since this may be part of a hierarchy; the PK of this table has a FK in a [table_dependent] table; in a physical database model, this is sometimes referred to as the parent table.
Note:
If there is a hierarchy, records from the "deepest" Dependent table should be deleted first. Which means indexes for this table must be deleted first as well. Then you should work up the hierarchy until you reach the parent table.
INSERT records with table dependency hierarchy:
SET IDENTITY_INSERT [table_independent] ON
INSERT INTO [table_independent]
(
[column that is not identity column],
[column that is not identity column],
[column that is not identity column]
)
VALUES
(
'1',
'2',
'3'
);
SET IDENTITY_INSERT [table_independent] OFF
SET IDENTITY_INSERT [table_dependent] ON
INSERT INTO [table_dependent]
(
[column that is not identity column],
[column that is not identity column],
[table_independent fk column]
)
VALUES
(
'1',
'2',
'3'
);
SET IDENTITY_INSERT [table_dependent] OFF
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With