I am using Ruby on Rails 3.2.2 and MySQL. I would like to know whether it is "advisable" / "desirable" to store, in a database table backing one class, a record for each "combination" of instances of two other classes.
That is, I have User and Article models. In order to store all user-article authorization objects, I would like to implement an ArticleUserAuthorization model so that, given N users and M articles, there are N*M ArticleUserAuthorization records.
That way, I can declare and use ActiveRecord::Associations as follows:
class Article < ActiveRecord::Base
  has_many :user_authorizations, :class_name => 'ArticleUserAuthorization'
  has_many :users, :through => :user_authorizations
end

class User < ActiveRecord::Base
  has_many :article_authorizations, :class_name => 'ArticleUserAuthorization'
  has_many :articles, :through => :article_authorizations
end
However, the above approach of storing all combinations will result in a big database table containing billions and billions of rows! Furthermore, ideally speaking, I am planning to create all authorization records when a User or an Article object is created (that is, I am planning to create all the previously mentioned "combinations" at once or, better, in "delayed" batches; either way, this process creates billions more database table rows) and to do the reverse when destroying (by deleting billions of database table rows). I am also planning to read and update those rows all at once when a User or Article object is updated.
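To put rough numbers on that concern, here is a back-of-the-envelope sketch in plain Ruby. The user and article counts and the 16-byte row size are made-up assumptions for illustration, not measurements of any real system, and the helper names are hypothetical.

```ruby
# Rough estimate of the cost of a full N*M authorization table.
# All figures below are illustrative assumptions, not measurements.

# Total rows when every user/article pair gets its own record.
def full_matrix_rows(users, articles)
  users * articles
end

# Rows that must be written when one new user is created
# (one record per existing article).
def rows_added_for_new_user(articles)
  articles
end

# Rows that must be written when one new article is created
# (one record per existing user).
def rows_added_for_new_article(users)
  users
end

BYTES_PER_ROW = 16 # assumed: two 4-byte ids plus some payload/overhead

users    = 1_000_000 # assumed
articles = 5_000_000 # assumed

rows = full_matrix_rows(users, articles)
puts "total rows: #{rows}"
puts "approx. size in TiB: #{rows * BYTES_PER_ROW / 1024**4}"
puts "rows written per new user: #{rows_added_for_new_user(articles)}"
puts "rows written per new article: #{rows_added_for_new_article(users)}"
```

The point of the sketch is that both the table size and the write cost of every create or destroy scale with the size of the other table.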
So, my doubts are:
Note: I would use this approach because, in order to retrieve only "authorized objects" when retrieving User or Article objects, I think I need "atomic" user authorization rules (that is, one user authorization record for each user and article object), since the system is not based on user groups like "admin", "registered" and so on. So, I thought that having an ArticleUserAuthorization table would avoid running methods related to user authorizations (note: those methods involve some MySQL querying that could worsen performance - see my previous question for a sample "authorization" method implementation) on each retrieved object, by "simply" accessing / joining the ArticleUserAuthorization table so as to retrieve only "user authorized" objects.
The fact of the matter is that if you want article-level permissions per user, then you need a way to relate Users to the Articles they can access. At a minimum, this necessitates N*A records (where N is the number of users and A is the number of uniquely permissioned articles).
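One possible reading of that N*A bound is to store records only for articles that actually have per-user rules and to fall back to a default for everything else. The following is a minimal in-memory sketch of that idea in plain Ruby; the class name `SparseAuthorizations`, its methods, and the "allowed by default" policy are all hypothetical and only stand in for what would be a database table.

```ruby
require 'set'

# Hypothetical in-memory stand-in for a sparse authorization table:
# only individually permissioned articles get records; every other
# article falls back to a default policy (an assumption of this sketch).
class SparseAuthorizations
  def initialize(default_allowed = true)
    @default_allowed = default_allowed
    @restricted = {} # article_id => Set of user_ids allowed to access it
  end

  # Mark an article as individually permissioned and grant one user access.
  def grant(user_id, article_id)
    (@restricted[article_id] ||= Set.new) << user_id
    self
  end

  def authorized?(user_id, article_id)
    allowed = @restricted[article_id]
    return @default_allowed if allowed.nil? # no per-user rule stored
    allowed.include?(user_id)
  end

  # Number of stored "records": one per granted pair, not one per
  # user/article combination.
  def row_count
    @restricted.values.inject(0) { |total, ids| total + ids.size }
  end
end

auth = SparseAuthorizations.new
auth.grant(1, 42)
puts auth.authorized?(1, 42) # user 1 was granted access to article 42
puts auth.authorized?(2, 42) # user 2 was not
puts auth.authorized?(2, 7)  # article 7 has no rules, so the default applies
puts auth.row_count
```

Whether a default policy is acceptable depends on the application; if every article really is uniquely permissioned, A equals M and nothing is saved.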
The 3NF approach to this would be, as you suggested, to have a UsersArticles set... which would be a very large table (as you noted).
Consider that this table would be accessed very frequently. This seems to me like one of the situations in which a slightly denormalized approach (or even NoSQL) is more appropriate.
Consider the model that Twitter uses for their user follower tables:
Jeff Atwood on the subject
And High Scalability Blog
One lesson from those pieces, learned at Twitter, is that querying followers from a normalized table puts tremendous stress on the Users table. Their solution was to denormalize followers so that a user's followers are stored with that user's individual settings.
Denormalize a lot. Single handedly saved them. For example, they store all a user IDs friend IDs together, which prevented a lot of costly joins.
- Avoid complex joins.
- Avoid scanning large sets of data.
I imagine a similar approach could be used to serve article permissions and avoid a tremendously stressed single UsersArticles table.
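As a rough illustration of what such a denormalized layout might look like, here is a plain-Ruby sketch in which each user carries its own set of authorized article ids. The `UserPermissions` class, its method names, and the comma-separated serialization are hypothetical choices for this sketch, not anything taken from Twitter's implementation or from Rails.

```ruby
require 'set'

# Hypothetical denormalized layout: each user keeps its own set of
# authorized article ids, in the spirit of storing friend ids together.
class UserPermissions
  attr_reader :user_id

  def initialize(user_id)
    @user_id = user_id
    @article_ids = Set.new
  end

  def grant(article_id)
    @article_ids << article_id
    self
  end

  def revoke(article_id)
    @article_ids.delete(article_id)
    self
  end

  def authorized?(article_id)
    @article_ids.include?(article_id)
  end

  # Filter a batch of candidate article ids without any join:
  # one set lookup per candidate.
  def filter(candidate_ids)
    candidate_ids.select { |id| authorized?(id) }
  end

  # Serialized form that could live in a single column or a
  # key-value store entry keyed by user id (an assumption of this sketch).
  def dump
    @article_ids.to_a.sort.join(',')
  end
end

perms = UserPermissions.new(1).grant(3).grant(10)
puts perms.filter([1, 3, 5, 10]).inspect
puts perms.dump
```

The trade-off is the usual one for denormalization: reads for a single user become cheap, while the inverse question ("which users can see this article?") and bulk updates when an article changes become harder.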