Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to implement user defined fields and grouping for the multi-tenant application: EAV, fixed tables pattern, NoSQL

I'm working at a SaaS, where any tenant can have several lists of contacts, each list can have any number of custom fields contacts of this list can store and any number of groups the contancts of the list can be includes (groups are used for segmenting contacts of the list). Each contact has one required fields: email_address and any number of userdefined fields that are defined for the list where it is as I mentioned about. We must be able to find contacts of the lists base on the groups they're in and the values of user defined values. We must provise up to 30 user defined fields. I see now three ways of solving this problem:

  1. Using kind of EAV (we try to do it like this) but it looks rather complex. We have a table lists (lists of tenants), a related tables custom_fields, a related tables subscribers that stored email_addreses of subscribers of the list, table subscribers_custom_data which is related with subscribers and custom_fields tables (stored values of the custom fields of the subscribers).

  2. Field tables pattern. The descriptions of it is here http://blog.springsource.com/arjen/archives/2008/01/24/storing-custom-fields-in-the-database/. In thise case we'd use a field related to custom fields, that would store in columns all custom fields for example have 30 columns to store values of each possible custom fields and a table that stored mapping of columns name and name of the user defined field. It looks complex too. We'd have to have at least 30 indexes at least to search by the values of custom fields, there other problems too,

  3. To use some sort of NoSQL database at least for storing of user defined fields and maybe groups of the list. Do you think such databases can help here and if so how to design for storing custom fields and groups. I try to look at different types NoSQL, for example, document oriented like MongoDb, but right away I can't see how it can help solve this problem. We can store arbitrary attributes here but to search for the values of the custom fields we need to index them in advance so we have to know what custom fields we'll have.

Thank you for any information about it.

like image 842
Oleg Avatar asked Nov 21 '10 08:11

Oleg


1 Answers

If you want all fields to be indexed all the time, try a technology like Apache Solr that indexes everything. The main purpose of Solr is to be a fulltext search engine, but it's basically a document-oriented database.

Here are comments about other options:

  1. EAV is no good, and I'm against use it. It breaks many rules of relational database design, and it won't scale. I've written about this on Stack Overflow a lot, so search for my answers under the eav tag.

  2. You don't need just 30 indexes -- you need up to 30-factorial indexes to handle any possible combination of indexes. Keep in mind you can make multi-column indexes, and these types of indexes are important to support certain queries. Of course this is totally impractical to create so many indexes; you need to create indexes to match the queries you want to optimize for. If you don't know what fields you'll have and what queries you'll have against them, you can't optimize.

  3. Document-oriented databases like MongoDB/CouchDB aren't magic, no matter how much their advocates try to claim that they are. They requires that you index documents for fast searches, and that means you need to know the indexable fields of a document.

    Creating an index at runtime is a problem, because it can take a long time, depending on how much data there is to index. You'll have to find a way to run the index creation "offline" (i.e. don't make the user wait for it during a single http request) and then notify them when it's complete.

  4. You should read about How FriendFeed uses MySQL to store schema-less data. They use a Serialized LOB, basically combine all the custom attributes together into one XML or JSON blob. So users can create any number of additional custom fields anytime they want. But before a given custom field can be made searchable, you would create a child table that references rows where that field contains a given value. Thus you get an index that is only as large as the number of instances of a given user-defined custom field. And you don't need to make every field searchable.

like image 181
Bill Karwin Avatar answered Nov 02 '22 05:11

Bill Karwin