I'm wondering if someone can provide various rationales/solutions for knowing when to delete records from a database vs. simply hiding them during read operations via a field value, e.g., is_hidden=1.
My application is a social network/e-commerce web application. I tend to favor the is_hidden strategy, but as one's site grows I can see this leading to a really badly performing site.
Here's my list. What items on the list am I missing? Is the list's prioritization good?
Delete:
- DELETE
is_hidden:
- CREATE data
- DELETE it later if necessary
- SELECT ... WHERE is_hidden!=1
Thoughts?
The major reason you might want to do a soft-delete is where an audit trail requires it. For example, we might have an invoice table with a voided column, and we would normally just omit voided invoices from queries. This preserves an audit trail so we know which invoices were entered and which ones were voided.
There are many fields (particularly in finance) where soft deletes are preferred for this reason. Typically the number of deletes is small compared to the data set, and you don't want to really delete because actually doing so might allow someone to cover up theft of money or real-world goods. The "deleted" data can then be shown for those queries which require it.
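The invoice case above can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the table layout, column names (customer, amount_cents) and sample rows are assumptions, and only the voided column comes from the description above.

```python
import sqlite3

# Hypothetical schema: an invoice table with a `voided` flag.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE invoice ("
    " id INTEGER PRIMARY KEY,"
    " customer TEXT NOT NULL,"
    " amount_cents INTEGER NOT NULL,"
    " voided INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO invoice (customer, amount_cents) VALUES (?, ?)",
    [("alice", 12000), ("bob", 4550), ("carol", 9900)],
)

# "Deleting" invoice 2 only flips the flag; the row stays for the audit trail.
conn.execute("UPDATE invoice SET voided = 1 WHERE id = ?", (2,))
conn.commit()

# Everyday queries omit voided invoices.
active_ids = [row[0] for row in conn.execute(
    "SELECT id FROM invoice WHERE voided = 0 ORDER BY id")]

# Audit queries read every row, including the voided ones.
audit_rows = conn.execute(
    "SELECT id, voided FROM invoice ORDER BY id").fetchall()

print(active_ids)  # [1, 3]
print(audit_rows)  # [(1, 0), (2, 1), (3, 0)]
```

The voided invoice disappears from the normal view but is still there when an auditor asks for it.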
A good non-db example would be as follows: "When writing in your general journal or general ledger, write with a pen and if you make an error that you spot right away, cross it out with a single line so that the original data is still legible, and write correct values underneath. If you find out later, either write in an adjustment entry or write in a reversal and a new one." In that case, your principal reason is to see what was changed and when, so that you can audit those changes if there is ever a question.
The people typically needing to see such information are likely to be financial or other auditors.
You've already said everything in your question:
DELETE will entirely delete the entry and is_hidden=1 will hide it.
So: If there's the possibility that you will need the data in the future, you should use the hiding method. If you are sure that the data will never ever be used again, use DELETE.
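To make the difference concrete, here is a rough sketch with Python's sqlite3 module. The post table, its body column and the visible_bodies helper are made up for illustration; only is_hidden and the WHERE is_hidden != 1 filter come from the question.

```python
import sqlite3

# Hypothetical table with an is_hidden flag, as in the question.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE post ("
    " id INTEGER PRIMARY KEY,"
    " body TEXT NOT NULL,"
    " is_hidden INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO post (body) VALUES (?)", [("first",), ("second",)])


def visible_bodies(conn):
    """Return the bodies of all rows that are not hidden."""
    rows = conn.execute(
        "SELECT body FROM post WHERE is_hidden != 1 ORDER BY id")
    return [row[0] for row in rows]


# Hiding is reversible: the row is still in the table.
conn.execute("UPDATE post SET is_hidden = 1 WHERE id = 1")
after_hide = visible_bodies(conn)

conn.execute("UPDATE post SET is_hidden = 0 WHERE id = 1")
after_restore = visible_bodies(conn)

# DELETE is not reversible: the row is gone for good.
conn.execute("DELETE FROM post WHERE id = 2")
after_delete = visible_bodies(conn)
total_rows = conn.execute("SELECT COUNT(*) FROM post").fetchone()[0]
conn.commit()

print(after_hide)     # ['second']
print(after_restore)  # ['first', 'second']
print(after_delete)   # ['first']
print(total_rows)     # 1
```

The price of the hiding method is that every read has to remember the is_hidden filter, which is where the performance worry comes from.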
Concerning performance:
You can use two tables:
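One possible reading of the two-table idea, as a sketch only: keep live rows in one table and move hidden rows into a second one, so normal reads need no is_hidden filter and the live table stays small. The table names post and post_hidden and the hide_post helper are assumptions for illustration, not something given in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    CREATE TABLE post_hidden (id INTEGER PRIMARY KEY, body TEXT NOT NULL);
    INSERT INTO post (body) VALUES ('first'), ('second'), ('third');
""")


def hide_post(conn, post_id):
    """Move one row from the live table to the hidden table in one transaction."""
    with conn:  # commits on success, rolls back if either statement fails
        conn.execute(
            "INSERT INTO post_hidden (id, body) "
            "SELECT id, body FROM post WHERE id = ?",
            (post_id,),
        )
        conn.execute("DELETE FROM post WHERE id = ?", (post_id,))


hide_post(conn, 2)

# Normal reads touch only the live table and need no flag filter.
live = conn.execute("SELECT id, body FROM post ORDER BY id").fetchall()
hidden = conn.execute("SELECT id, body FROM post_hidden ORDER BY id").fetchall()

print(live)    # [(1, 'first'), (3, 'third')]
print(hidden)  # [(2, 'second')]
```

Restoring a row would be the same move in the other direction.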
Or even three tables:
Or:
It's all up to you. But if you look at Facebook or Google: They will never ever delete anything! Data == Money == Power ;)