
Is storing counts of database records redundant?

I'm using Rails and MySQL, and have an efficiency question about row counting.

I have a Project model that has_many :donations.

I want to count the number of unique donors for a project.

Is it a good idea to have a field called num_donors in the projects table and increment it whenever a new donor is created?

Or is something like @num_donors = Donor.count(:select => 'DISTINCT user_id') going to be similar or the same in terms of efficiency thanks to database optimization? Will this require me to create indexes for user_id and any other fields I want to count?

Does the same answer hold for summing the total amount donated?

nfm asked Oct 03 '09 00:10


3 Answers

To answer the title question: yes, it is redundant, but whether you should do it depends on your situation.

Unless you have known performance problems, calculate the counts and totals on the fly in your application and don't store them. That is, don't store calculated values unless you have no other choice.

In most situations you won't have to resort to storing calculated values, and you shouldn't.
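As a rough illustration of calculating on the fly, here is a minimal plain-Ruby sketch. The `DonationRow` struct, the sample data, and the helper names are assumptions made up for this example; they stand in for rows of a donations table and are not the asker's schema.

```ruby
# Hypothetical in-memory stand-in for rows of a donations table.
DonationRow = Struct.new(:project_id, :user_id, :amount)

DONATION_ROWS = [
  DonationRow.new(1, 10, 25),
  DonationRow.new(1, 11, 50),
  DonationRow.new(1, 10, 5),   # repeat donor on project 1
  DonationRow.new(2, 10, 100)
].freeze

# Number of distinct donors for a project, derived from the source rows.
def num_donors(rows, project_id)
  rows.select { |r| r.project_id == project_id }
      .map(&:user_id)
      .uniq
      .size
end

# Total amount donated to a project, derived from the source rows.
def total_donated(rows, project_id)
  rows.select { |r| r.project_id == project_id }.sum(&:amount)
end
```

In a real Rails app the same aggregates would normally be pushed down to the database rather than computed in Ruby, for example with `project.donations.distinct.count(:user_id)` and `project.donations.sum(:amount)` in recent Rails versions.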

If you must store calculated values, do the following:

  • Don't keep it up to date by incrementing it. Recalculate the count/total from all the data each time you update it.
  • If you don't have a lot of updates, put the code in an update trigger to keep the counts/totals up to date.
  • The trouble with redundancy in databases is that when the numbers disagree, you can't tell which one is authoritative. Add a note to the documentation that the source data is authoritative and that the stored value can be overwritten if they disagree.
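The first bullet can be sketched in plain Ruby as follows. The `Project` class, its methods, and the in-memory array are hypothetical, chosen only to show the idea of refreshing a stored value from the source data instead of adding or subtracting 1.

```ruby
# Hypothetical in-memory model; names are illustrative only.
class Project
  attr_reader :num_donors

  def initialize
    @donor_ids = []   # source data: one user_id per donation
    @num_donors = 0   # stored (redundant) value
  end

  def add_donation(user_id)
    @donor_ids << user_id
    refresh_num_donors
  end

  def remove_donation(user_id)
    index = @donor_ids.index(user_id)
    @donor_ids.delete_at(index) if index
    refresh_num_donors
  end

  private

  # Recompute from the source data; the stored value is always overwritten.
  def refresh_num_donors
    @num_donors = @donor_ids.uniq.size
  end
end

project = Project.new
project.add_donation(10)
project.add_donation(10)     # repeat donor: the count must not go up
project.add_donation(11)
project.remove_donation(10)  # user 10 still has one donation left
```

A naive `+= 1` on every donation would have counted user 10 twice, and a naive `-= 1` on removal would have dropped a donor who still has a donation. Recalculating avoids both cases, at the cost of a heavier write.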
JohnFx answered Sep 23 '22 06:09


While it depends on the size of your database, these are the kinds of operations that databases specialize in, so they should be fast. This is probably a case of premature optimization: start by not storing the totals, which keeps things simpler, and optimize later if necessary.

Peter answered Sep 20 '22 06:09


Peter's and JohnFx's answers are sound. What you're proposing is a denormalization of your database schema, which can improve read performance at the expense of writes. It also puts the onus on the developer (or on additional DBMS cleverness) to prevent inconsistencies within your dataset.

ActiveRecord has some built-in functionality to automatically manage counts on has_many relationships. Check out this Railscast on counter caches.
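In Rails, a counter cache is typically declared with `belongs_to :project, counter_cache: true` on the child model, plus an integer `donations_count` column on the parent table. The sketch below is a plain-Ruby imitation of that behaviour, not ActiveRecord's implementation; the `CachedProject` and `CachedDonation` classes are hypothetical names used only for illustration.

```ruby
# Plain-Ruby imitation of counter-cache behaviour (NOT ActiveRecord itself).
class CachedProject
  attr_accessor :donations_count

  def initialize
    @donations_count = 0
  end
end

class CachedDonation
  attr_reader :project, :user_id

  def initialize(project, user_id)
    @project = project
    @user_id = user_id
    @project.donations_count += 1   # bumped when a child record is created
  end

  def destroy
    @project.donations_count -= 1   # decremented when a child record is destroyed
  end
end

cached_project = CachedProject.new
first = CachedDonation.new(cached_project, 10)
CachedDonation.new(cached_project, 10)   # same donor again
CachedDonation.new(cached_project, 11)
```

Note that a counter cache of this kind tracks the number of associated rows (three donations here), not the number of distinct donors (two), so it would not answer the unique-donor question by itself.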

fractious answered Sep 21 '22 06:09