Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How does deleting rows from a table affect its indexes?

Tags:

sql

delete-row

Is it better, for indexing in the long run, to delete or "deactivate" rows, or under what circumstances?

I notice overhead generated after deleting a significant number of rows from a table. Why does this occur and what can be done 1) to prevent it and 2) to fix it?

like image 328
reformed Avatar asked Oct 22 '13 04:10

reformed


2 Answers

If large volumes/ old or archived history are going to be deleted -- delete them outright.

In the near term, for manual user-level deletion, a "soft delete" is typically preferred. Manual deletions probably won't be more than about 10% of records, so index effectiveness will remain high.

"Soft delete" also has major benefits that admin can un-delete mistakenly deletions, and referential integrity & the referenced details of a transaction are all happily preserved!

For long-term archiving/deletion, you want to remove those records from the index -- and, apart from proprietary & database-specific "conditional indexing" which I'd tend to avoid, deleting them from the table is the only way to remove them from the index.

like image 59
Thomas W Avatar answered Oct 03 '22 02:10

Thomas W


For SQL Server...

I think it's important to know that if you're deleting all records for a very large table (meaning lots of records), you'll want to truncate first, then drop the indexes. It's much more efficient.

If you want to remove a subset of the records, and you have indexes applied, use the DELETE FROM {table} WHERE {condition} syntax. If you do, you must delete from the dependent tables first in the order of the dependency hierarchy. Basically the exact opposite of how you insert records, starting with the non-dependent tables first.

DELETE records with table dependency hierarchy:

Dependent/Child Table (depends on the Dependency table):

DELETE FROM [table_dependent]; -- "dependent" is a relative term since this may be part of a hierarchy; a FK in this table points to the PK of the [table_independent] table; in a physical database model, this table is sometimes referred to as the child table

Dependency/Parent Table:

DELETE FROM [table_independent];  -- "independent" is a relative term since this may be part of a hierarchy; the PK of this table has a FK in a [table_dependent] table; in a physical database model, this is sometimes referred to as the parent table.

Note:

If there is a hierarchy, records from the "deepest" Dependent table should be deleted first. Which means indexes for this table must be deleted first as well. Then you should work up the hierarchy until you reach the parent table.

INSERT records with table dependency hierarchy:

SET IDENTITY_INSERT [table_independent] ON

INSERT INTO [table_independent]
(
[column that is not identity column],
[column that is not identity column],
[column that is not identity column]
)
VALUES
(
'1',
'2',
'3'
);

SET IDENTITY_INSERT [table_independent] OFF

SET IDENTITY_INSERT [table_dependent] ON

INSERT INTO [table_dependent]
(
[column that is not identity column],
[column that is not identity column],
[table_independent fk column]
)
VALUES
(
'1',
'2',
'3'
);

SET IDENTITY_INSERT [table_dependent] OFF
like image 31
MacGyver Avatar answered Oct 03 '22 02:10

MacGyver