I'm trying to think ahead a little and avoid myself some extra pain if possible.
I've had this problem in past applications and have usually opted for the most verbose approach but would like the opinions of a few others.
If you have a basic table such as below, is it wise, and/or more efficient to include a field which includes a calculation from the information which could be found from two other columns. IE:
+-----+---------+------------+-------+--------+-------+
| id | room_id | bookdate | price | people | total |
+-----+---------+------------+-------+--------+-------+
| 414 | 132 | 2010-03-01 | 14.55 | 2 | 29.10 |
| 415 | 132 | 2010-03-02 | 14.55 | 2 | 29.10 |
| 416 | 132 | 2010-03-03 | 14.55 | 2 | 29.10 |
+-----+---------+------------+-------+--------+-------+
The information in the last field could be extracted from the product of the previous two, therefore it is redundant and unnecessary. Are there instances when it could still be worthwhile to have it?
As a rule of thumb, I don't store values that can be calculated (especially ones that can be easily calculated) on the fly unless there is a performance issue and I need to save some processing time.
This is a classic tradeoff between performance and storage. I would recommend calculating the value until you need a performance boost.
Maybe create a table, that contains all fields except last one, and then create a view that contains all fields and counts the last fields automatically?
So the table would contain only these fields
+-----+---------+------------+-------+--------+
| id | room_id | bookdate | price | people |
+-----+---------+------------+-------+--------+
| 414 | 132 | 2010-03-01 | 14.55 | 2 |
And the definition of the view, that calculates the total is also very simple:
select *, price*people as total from rooms
(assuming that your table is called rooms
The general rule is that you shouldn't store what you can easily calculate, but if you've identified this area as being a performance bottleneck—by profiling your app, not guesswork—then do it.
If you decide to denormalize for read performance, you can add a check constraint to enforce consistency.
create table rooms (
price numeric,
people numeric,
total numeric check (total=price*people));
This will add slight overhead to inserts and updates.
I'm often in favor of a calculated field assuming you do it correctly by defining the field in the database as calculated. This way the calculation is always applicable no matter how the data changes. I'd only do this this though if you are going to need to get those calculations in reports that contain many records. Sure it's easy to write the formula inthe query, but if you calculate this number frequently you are wasting server resources (a calculated field only does the calc when the information changes) and possibly seriously slowing down the query if it must do the calc for millions of records for reports. A materialized view is also a good idea (becasue it will pre-calulate), but a regular view just gets you out of writing the calc multiple times, it doesn't have the performance benefit of a calculated field. On the other hand, I never create views if I don't need to (I.e. I can solve the problem so other way) as they can get you into real performance trouble when people start creating views on top of views. Don't use a hammer when a screwdriver is what you need.
Calculated fields are powerful tools when used properly and are often overlooked by database designers.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With