Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Schema to support dynamic properties

I'm working on an editor that enables its users to create "object" definitions in real-time. A definition can contain zero or more properties. A property has a name a type. Once a definition is created, a user can create an object of that definition and set the property values of that object.

So by the click of a mouse-button, the user should ie. be able to create a new definition called "Bicycle", and add the property "Size" of type "Numeric". Then another property called "Name" of type "Text", and then another property called "Price" of type "Numeric". Once that is done, the user should be able to create a couple of "Bicycle" objects and fill in the "Name" and "Price" property values of each bike.

Now, I've seen this feature in several software products, so it must be a well-known concept. My problem started when I sat down and tried to come up with a DB schema to support this data structure, because I want the property values to be stored using the appropriate column types. Ie. a numeric property value is stored as, say, an INT in the database, and a textual property value is stored as VARCHAR.

First, I need a table that will hold all my object definitions:

Table obj_defs

id | name      |
----------------
 1 | "Bicycle" |
 2 | "Book"    |

Then I need a table for holding what sort of properties each object definition should have:

Table prop_defs

id | obj_def_id | name      | type |
------------------------------------
 1 |          1 | "Size"    |    ? |
 2 |          1 | "Name"    |    ? |
 3 |          1 | "Price"   |    ? |
 4 |          2 | "Title"   |    ? |
 5 |          2 | "Author"  |    ? |
 6 |          2 | "ISBN"    |    ? |

I would also need a table that holds each object:

Table objects

id | created    | updated    |
------------------------------
 1 | 2011-05-14 | 2011-06-15 |
 2 | 2011-05-14 | 2011-06-15 |
 3 | 2011-05-14 | 2011-06-15 |

Finally, I need a table that will hold the actual property values of each object, and one solution is for this table to have one column for each possible value type, such as this:

Table prop_vals

id | prop_def_id | object_id | numeric | textual | boolean |
------------------------------------------------------------
 1 |           1 |         1 |      27 |         |         |
 2 |           2 |         1 |         |  "Trek" |         |
 3 |           3 |         1 |    1249 |         |         |
 4 |           1 |         2 |      26 |         |         |
 5 |           2 |         2 |         |    "GT" |         |
 6 |           3 |         2 |     159 |         |         |
 7 |           4 |         3 |         |    "It" |         |
 8 |           5 |         3 |         |  "King" |         |
 9 |           6 |         4 |       9 |         |         |

If I implemented this schema, what would the "type" column of the prop_defs table hold? Integers that each map to a column name, varchars that simply hold the column name? Any other possibilities? Would a stored procedure help me out here in some way? And what would the SQL for fetching the "name" property of object 2 look like?

like image 468
Johan Fredrik Varen Avatar asked Jan 16 '11 14:01

Johan Fredrik Varen


People also ask

What is a dynamic schema?

What many claim as the best of both worlds, a dynamic schema is one that changes as you add data. There is no need to define the schema beforehand. When data is inserted, updated, or removed, the database builds a schema dynamically. Popular dynamic schema databases include HarperDB and MongoDB.

What is static and dynamic schema?

In short, Static Schema is a fixed schema configured when creating the workspace; Dynamic Schema is a schema which would be dynamically configured at run-time based on a resource (schema of the source dataset in many cases).

What are the dynamic properties?

Dynamic properties of structures characterize a system in form of natural frequencies, damping and mode shape. These dynamic properties are used to formulate mathematical model for its behavior. The evaluation of dynamic properties of is Nnown as modal analysis.


2 Answers

You are implementing something called Entity-Attribute-Value model http://en.wikipedia.org/wiki/Entity-attribute-value_model.

Lots of folks will say it's a bad idea (usually I am one of those) because the answer to your last question, "What would the SQL for fetching..." tends to be "thick hairy and nasty, and gettting worse."

These criticisms tend to hold once you allow users to start nesting objects inside of other objects, if you do not allow that, the situation will remain manageable.

For your first question, "what would the "type" column of the prop_defs table hold", everything will be simpler if you have a table of types and descriptions that holds {"numeric","Any Number"}, {"textual","String"}, etc. The first value is the primary key. Then in prop_defs your column "type" is a foreign key to that table and holds values "numeric", "textual", etc. Some will tell you incorrectly to always use integer keys because they JOIN faster, but if you use the values "numeric", "textual" etc. you don't have to JOIN and the fastest JOIN is the one you don't do.

The query to grab a single value will have a CASE statement:

SELECT case when pd.type = "numeric" then pv.numeric
            when pd.type = "textual" then pv.textual
            when pd.type = "boolean" then pv.boolean
  from prov_vals pv 
  JOIN prop_defs pd ON pv.prop_def_id = pv.id
 WHERE pv.object_id = 2
   AND pd.name = "Name"
like image 52
Ken Downs Avatar answered Sep 27 '22 20:09

Ken Downs


You must accept that relational databases are not good at providing this kind of functionality. They CAN provide it, but are not good at it. (I hope I'm wrong). Relational databases lend themselves better to defined interfaces, not changing interfaces.

--EAV tables give dynamic fields but suck on performance. Sucks on indexing. And it is complex to query. It gets the job done in many situations, but can fall apart on big tables with lots of users hitting the system.

--"Regular" tables with several place holder columns are OK for performance, but you get non-descriptive column names and are limited in the number of columns you can "add". Also it does not support sub-type separation.

--Typically you create/modify tables at development time, not run time. Should we really discriminate against modifying the database at run time? maybe, maybe not. Creating new tables, foreign keys, and columns at run-time can achieve true dynamic objects, while giving the performance benefits of "regular" tables. But you would have to query the schema of the database, then dynamically generate all of your queries. That would suck. It would totally break the concept of tables as an interface.

like image 33
Lord Tydus Avatar answered Sep 27 '22 19:09

Lord Tydus