Recently, I stumbled about the following problem: Given is a simple data model with "Books" and "Authors". Each "Book" has a reference to an "Author". Persistence is achieved with a relational database. Besides adding books and authors, it is also possible to delete them. Usually, if I want to delete an Author, i would perform a SQL DELETE
operation and remove the corresponding row. However, I have seen in other projects, people don't call DELETE
. Instead, they add some kind of active/deleted flag and mark the corresponding row as "deleted".
My questions are: Is this in general best practice? What are the advantages? My best guess is:
DELETE
operationAnyway, these are just guesses. Does someone know the answer?
There are many reasons to not use delete
. First, maintaining history can be very important. I wouldn't use "just" a delete flag, but instead have dates of validity.
Second, in an operational system, delete
can be an expensive operation. The row needs to be deleted from the table, from associated indexes, and then there might be cascading deletes and triggers.
Third, delete
can prevent other operations from working well, because tables and rows and indexes get locked. This can slow down an operational system, particularly during peak periods.
Fourth, delete
can be tricky to maintain relational integrity -- especially if those cascading deletes are not defined.
Fifth, storage is cheap. Processing power is cheap. So, for many databases, deleting records to recover space is simply unnecessary.
This doesn't mean that you should always avoid deleting records. But there are very valid reasons for not rushing to remove data.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With