
Storing "derived" values vs calculating them on extraction

When you have values that depend only on one or more other fields plus or minus constants (say, a retail price and a discount price), is it better to store those values too, or to calculate them "on the fly" when retrieving the data?

Matteo Riva asked Apr 21 '10 07:04


2 Answers

The default is not to store redundant information: the third normal form is usually a sensible initial goal. Redundancy is introduced when a "good enough" reason appears, such as a "big enough" performance hit from recalculating an expensive derived value each time it is needed.

Obviously, "good enough" and "big enough" are qualifiers that only mean something in a given context. For what it's worth, the retail/discount price calculation seems too cheap and simple to warrant a redundant column in most (obviously not all) cases.
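
As a minimal sketch of the "calculate on the fly" option, assuming a hypothetical `product` table with prices in integer cents and a percentage discount (none of these names come from the question), the derived price can be produced in the query itself, here using Python's built-in `sqlite3`:

```python
import sqlite3

# Hypothetical schema: only the base facts are stored, not the discount price.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " retail_cents INTEGER NOT NULL,"
    " discount_pct INTEGER NOT NULL)"
)
conn.execute(
    "INSERT INTO product (name, retail_cents, discount_pct) "
    "VALUES ('widget', 2000, 25)"
)


def discount_prices(conn):
    """Return (name, retail_cents, discount_cents), deriving the last column on read."""
    return conn.execute(
        "SELECT name, retail_cents,"
        " retail_cents * (100 - discount_pct) / 100 AS discount_cents "
        "FROM product"
    ).fetchall()


print(discount_prices(conn))
```

Because nothing derived is stored, changing `retail_cents` or `discount_pct` is a single update and the next read reflects it automatically.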

Tomislav Nakic-Alfirevic answered Nov 11 '22 03:11


I would agree with Tomislav: try to avoid redundancy, because you can end up with data in multiple tables disagreeing with each other, and it makes updates more painful.
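
To illustrate how stored derived data can drift, here is a small sketch using Python's `sqlite3`. The `product` table and its `discount_cents` column are hypothetical, and for brevity the redundant value sits in the same table rather than a second one; the failure mode is the same. An update touches the base price but forgets the stored copy, and a consistency query then reports the mismatch:

```python
import sqlite3

# Hypothetical schema where the derived discount price is ALSO stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " name TEXT NOT NULL,"
    " retail_cents INTEGER NOT NULL,"
    " discount_pct INTEGER NOT NULL,"
    " discount_cents INTEGER NOT NULL)"
)
conn.execute("INSERT INTO product VALUES ('widget', 2000, 25, 1500)")

# An update that changes the input but forgets the redundant column.
conn.execute("UPDATE product SET retail_cents = 4000 WHERE name = 'widget'")


def stale_rows(conn):
    """Return rows whose stored discount no longer matches the recomputed one."""
    return conn.execute(
        "SELECT name, discount_cents,"
        " retail_cents * (100 - discount_pct) / 100 AS expected_cents "
        "FROM product "
        "WHERE discount_cents != retail_cents * (100 - discount_pct) / 100"
    ).fetchall()


print(stale_rows(conn))
```

Every write path that touches an input has to remember the stored copy as well, which is the extra update burden described above.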

There are exceptions worth considering, though, that are not related to database performance.

  • When it is painful to calculate the value (e.g. some complex mathematical function), it makes sense to store it (you could think of the column as the 'last calculated value').
  • You might have inputs that change over time, e.g. a fee is derived from a fee rate, but the fee rate is stored as a single value in a configuration table. You might want to record the fee, because otherwise historical fees could only be recalculated from the current fee rate. Alternatively, you might store the rate by time as well to circumvent this problem.
  • If the derived value can be overridden by user input or some other process, then again it makes sense to store it. Alternatively, you might model this with two states, 'CALCULATED' and 'OVERRIDDEN', so that you only store a value in the latter state.
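
The second and third cases can be sketched together with Python's `sqlite3`. All table, column, and function names here are hypothetical, and the rate is assumed to be held in basis points. The fee is snapshotted when the payment is recorded, and a nullable override column stands in for the 'CALCULATED' / 'OVERRIDDEN' states (NULL means calculated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE config (fee_rate_bp INTEGER NOT NULL);  -- single current rate
    CREATE TABLE payment (
        id INTEGER PRIMARY KEY,
        amount_cents INTEGER NOT NULL,
        fee_cents INTEGER NOT NULL,      -- snapshot taken at payment time
        fee_override_cents INTEGER       -- NULL = CALCULATED, set = OVERRIDDEN
    );
    INSERT INTO config VALUES (200);     -- 2.00 %
    """
)


def record_payment(conn, amount_cents):
    """Store the payment together with the fee implied by the rate in force now."""
    (rate_bp,) = conn.execute("SELECT fee_rate_bp FROM config").fetchone()
    fee_cents = amount_cents * rate_bp // 10000
    cur = conn.execute(
        "INSERT INTO payment (amount_cents, fee_cents) VALUES (?, ?)",
        (amount_cents, fee_cents),
    )
    return cur.lastrowid


def effective_fee(conn, payment_id):
    """Return the override if one exists, otherwise the snapshotted fee."""
    (fee,) = conn.execute(
        "SELECT COALESCE(fee_override_cents, fee_cents) FROM payment WHERE id = ?",
        (payment_id,),
    ).fetchone()
    return fee


first = record_payment(conn, 10000)                  # charged at 2.00 %
conn.execute("UPDATE config SET fee_rate_bp = 300")  # rate changes later
second = record_payment(conn, 10000)                 # charged at 3.00 %
conn.execute(
    "UPDATE payment SET fee_override_cents = 0 WHERE id = ?", (second,)
)                                                    # fee waived by a user

print(effective_fee(conn, first), effective_fee(conn, second))
```

The first payment keeps its original fee even after the rate changes, which would not be recoverable from the configuration table alone. Storing the rate by effective date, as mentioned above, is the alternative that avoids the snapshot column.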

Joel Goodwin answered Nov 11 '22 01:11