Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL - Calculating fields on the fly vs storing calculated data

Tags:

php

mysql

field

I apologise if this has been asked before, but I can't seem to find an answer to a question that I have about calculating on the fly vs storing fields in a database.

I read a few articles that suggested it was preferable to calculate when you can, but I would just like to know if that still applies to the following 2 examples.

Example 1. Say you are storing data relating to a car. You store the fuel tank size in litres, and how many litres it uses per 100km. You also want to know how many KMs it can travel, which can be calculated from the tank size and economy. I see 2 ways of doing this:

  1. When a car is added or updated, calculate the amount of KMs and store this as a static field in the database.
  2. Every time a car is accessed, calculate the amount of KMs on the fly.

Because the cars economy/tank size doesn't change (although it could be edited), the KMs is a pretty static value. I don't see why we would calculate it every single time the car is accessed. Wouldn't this waste cpu time as opposed to simply storing it in a separate field in the database and calculating only when a car is added or updated?

My next example, which is almost an entirely different question (but on the same topic), relates to counting children.

Let's say we have a app which has categories and items. We have a view where we display all the categories, and a count of all the items inside each category. Again, I'm wondering what's better. To perform a MySQL query to count all the items in each category every single time the page is accessed? Or store the count in a field in the categories table and update when an item is added / deleted?

I know it is redundant to store anything that can be calculated, but I worry that calculating fields or counting records might be slow as opposed to storing the data in a field. If it's not then please let me know, I just want to learn about when to use either method. On a small scale I guess it wouldn't matter either way, but apps like Facebook, would they really count the amount of friends you have every time someone views your profile or would they just store it as a field?

I'd appreciate any responses to both of these scenarios, and any resource that might explain the benefits of calculating vs storing.

Thanks in advance,

Christian

like image 249
Christian Avatar asked Dec 22 '10 11:12

Christian


2 Answers

Introducing redundancy into the database is a valid means of optimization. As with all optimizations, don't do it unless you have confirmed that this is where the bottleneck actually is.

like image 142
Oswald Avatar answered Oct 22 '22 17:10

Oswald


One thing to notice is the way you use your data. If several applications, or several layers of your application (maybe old code and new code in the same app) is accessing your data you'll reduce the risk of errors in computing by pre-calculating in the database. Then your calculated data will always be the same, no matter which application is requesting it.

For your first example, there is no reason that someone someday will have to change the way your KMs will need to be computed. I would store it in database (via triggers or via PHP on the insert/update -- because MySQl triggers are... well they are... not as good as some other DB triggers).

Now if we taking your second example it's really not sure someone will not want some day to add some filters on that categories computing. For example, take only children which are between 2 and 5. Then all your pre-computed results serves nothing. If you need some optimizations and caches of theses things it's maybe more an application-layer cache you would need, something like memcache, or pre-computed results stored in a cache table. But this cache is an application cache, which is related in a certain way on your application parameters (requests with different filters would use a different record in the cache).

Note that with MySQl you've got as well a nice query cache which will prevent the same query to be computed too much.

like image 34
regilero Avatar answered Oct 22 '22 15:10

regilero