
How to store data with dynamic number of attributes in a database


I have a number of different objects with a varying number of attributes. Until now I have saved the data in XML files which easily allow for an ever changing number of attributes. But I am trying to move it to a database.

What would be your preferred way to store this data?

A few strategies I have identified so far:

  • Having a single field named "attributes" in the object's table and storing the data there, serialized or as JSON.
  • Storing the data in two tables (objects, attributes) and using a third to save the relations, making it a true n:m relation. Very clean solution, but possibly very expensive when fetching an entire object and all its attributes.
  • Identifying the attributes all objects have in common and creating fields for these in the object's table, then storing the remaining attributes as serialized data in another field. This has an advantage over the first strategy: searches on the common attributes are easier.
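To make the third strategy concrete, here is a minimal sketch in Python using the standard library's sqlite3 and json modules. SQLite only stands in for whatever database is actually used, and the `color` column, the `extra_attributes` column, and the helper names are hypothetical, chosen purely for illustration.

```python
import json
import sqlite3


def open_database():
    """Create an in-memory database with common columns plus a serialized remainder."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE objects (
            object_id        INTEGER PRIMARY KEY,
            object_name      TEXT NOT NULL,
            color            TEXT,                    -- hypothetical common attribute
            extra_attributes TEXT NOT NULL DEFAULT '{}'  -- everything else, as JSON
        )
        """
    )
    # The common attribute is a real column, so it can be indexed and searched.
    conn.execute("CREATE INDEX idx_objects_color ON objects (color)")
    return conn


def add_object(conn, name, color, **extra):
    """Insert an object; attributes without a dedicated column go into the JSON field."""
    cur = conn.execute(
        "INSERT INTO objects (object_name, color, extra_attributes) VALUES (?, ?, ?)",
        (name, color, json.dumps(extra)),
    )
    return cur.lastrowid


def load_object(conn, object_id):
    """Return the object as one flat dict, or None if it does not exist."""
    row = conn.execute(
        "SELECT object_name, color, extra_attributes FROM objects WHERE object_id = ?",
        (object_id,),
    ).fetchone()
    if row is None:
        return None
    name, color, extra = row
    return {"name": name, "color": color, **json.loads(extra)}
```

Searching on `color` is then an ordinary `WHERE color = ?` query, while anything inside `extra_attributes` still has to be decoded row by row.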

Any ideas?

Jörg asked Sep 18 '09 13:09



1 Answer

If you ever plan on searching for specific attributes, it's a bad idea to serialize them into a single column, since you'll have to use per-row functions to get the information out - this rarely scales well.
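As a rough illustration of that point, the sketch below stores the attributes as JSON in one column and then searches for an attribute value. It uses Python's sqlite3 and json modules as a stand-in for the real database; the table layout and the function names are assumptions made for the example. Note that the search has to read and decode every row.

```python
import json
import sqlite3


def open_database():
    """In-memory database where all attributes live in one serialized column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        " object_id INTEGER PRIMARY KEY,"
        " object_name TEXT,"
        " attributes TEXT)"  # JSON text, opaque to the database
    )
    return conn


def add_object(conn, name, attributes):
    """Serialize the attribute dict and store it alongside the object."""
    cur = conn.execute(
        "INSERT INTO objects (object_name, attributes) VALUES (?, ?)",
        (name, json.dumps(attributes)),
    )
    return cur.lastrowid


def find_by_attribute(conn, key, value):
    """Return ids of objects whose attribute `key` equals `value`.

    Every row is fetched and decoded, because the database cannot see
    inside the serialized column.
    """
    matches = []
    query = "SELECT object_id, attributes FROM objects ORDER BY object_id"
    for object_id, blob in conn.execute(query):
        if json.loads(blob).get(key) == value:
            matches.append(object_id)
    return matches
```

The cost of `find_by_attribute` grows with the total number of objects, not with the number of matches, which is the scaling problem described above.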

I would opt for your second choice. Have a list of attributes in an attributes table, the objects in their own table, and a many-to-many relationship table called object_attributes.

For example:

    objects:
        object_id    integer
        object_name  varchar(20)
        primary key  (object_id)
    attributes:
        attr_id      integer
        attr_name    varchar(20)
        primary key  (attr_id)
    object_attributes:
        object_id    integer  references (objects.object_id)
        attr_id      integer  references (attributes.attr_id)
        oa_value     varchar(20)
        primary key  (object_id,attr_id)
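One way to try this layout out is the following sketch, which creates the three tables in an in-memory SQLite database from Python and fetches an object together with all of its attributes in a single query. SQLite is just a convenient stand-in, and the helper names `open_database`, `fetch_object`, and `find_objects` are made up for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE objects (
    object_id   INTEGER PRIMARY KEY,
    object_name VARCHAR(20)
);
CREATE TABLE attributes (
    attr_id   INTEGER PRIMARY KEY,
    attr_name VARCHAR(20)
);
CREATE TABLE object_attributes (
    object_id INTEGER REFERENCES objects(object_id),
    attr_id   INTEGER REFERENCES attributes(attr_id),
    oa_value  VARCHAR(20),
    PRIMARY KEY (object_id, attr_id)
);
"""


def open_database():
    """Create the three tables in a fresh in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def fetch_object(conn, object_id):
    """Load one object and all of its attributes with a single joined query."""
    rows = conn.execute(
        """
        SELECT o.object_name, a.attr_name, oa.oa_value
        FROM objects AS o
        LEFT JOIN object_attributes AS oa ON oa.object_id = o.object_id
        LEFT JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE o.object_id = ?
        """,
        (object_id,),
    ).fetchall()
    if not rows:
        return None
    # An object without attributes yields one row with NULL attribute columns.
    attrs = {name: value for _, name, value in rows if name is not None}
    return {"name": rows[0][0], "attributes": attrs}


def find_objects(conn, attr_name, value):
    """Return ids of objects having the given attribute value, using plain SQL."""
    rows = conn.execute(
        """
        SELECT oa.object_id
        FROM object_attributes AS oa
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE a.attr_name = ? AND oa.oa_value = ?
        ORDER BY oa.object_id
        """,
        (attr_name, value),
    ).fetchall()
    return [object_id for (object_id,) in rows]
```

Here the search in `find_objects` is an ordinary join the database can optimize, and fetching a whole object is still one round trip rather than one query per attribute.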

Your concern about performance is noted but, in my experience, it's always more costly to split a column than to combine multiple columns. If it turns out that there are performance problems, it's perfectly acceptable to break 3NF for performance reasons.

In that case I would store it the same way but also have a column with the raw serialized data. Provided you use insert/update triggers to keep the columnar and combined data in sync, you won't have any problems. But you shouldn't worry about that until an actual problem surfaces.

By using those triggers, you limit the extra work to the moments when the data actually changes. By extracting sub-column information instead, you do unnecessary work on every select.
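A minimal sketch of such triggers, again using SQLite from Python as a stand-in for the real database. The `raw_attributes` column, the trigger names, and the `name=value;name=value` serialization are assumptions made for the example, and a delete trigger is included in addition to the insert/update triggers so that removed attributes also disappear from the combined column.

```python
import sqlite3

SCHEMA = """
CREATE TABLE objects (
    object_id      INTEGER PRIMARY KEY,
    object_name    VARCHAR(20),
    raw_attributes TEXT  -- denormalized copy, written only by the triggers
);
CREATE TABLE attributes (
    attr_id   INTEGER PRIMARY KEY,
    attr_name VARCHAR(20)
);
CREATE TABLE object_attributes (
    object_id INTEGER REFERENCES objects(object_id),
    attr_id   INTEGER REFERENCES attributes(attr_id),
    oa_value  VARCHAR(20),
    PRIMARY KEY (object_id, attr_id)
);
"""

# One trigger per event; each rebuilds the combined column for the affected object.
# Simplification: an UPDATE that moves a row to a different object_id only
# refreshes the new owner.
TRIGGER_TEMPLATE = """
CREATE TRIGGER oa_sync_{event} AFTER {event} ON object_attributes
BEGIN
    UPDATE objects
    SET raw_attributes = (
        SELECT group_concat(a.attr_name || '=' || oa.oa_value, ';')
        FROM object_attributes AS oa
        JOIN attributes AS a ON a.attr_id = oa.attr_id
        WHERE oa.object_id = objects.object_id
    )
    WHERE object_id = {row}.object_id;
END;
"""


def create_database():
    """Create the tables and the three synchronizing triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for event, row in (("INSERT", "NEW"), ("UPDATE", "NEW"), ("DELETE", "OLD")):
        conn.executescript(TRIGGER_TEMPLATE.format(event=event, row=row))
    return conn


def raw_attributes(conn, object_id):
    """Read the combined column directly; no join or per-row parsing needed."""
    row = conn.execute(
        "SELECT raw_attributes FROM objects WHERE object_id = ?", (object_id,)
    ).fetchone()
    return None if row is None else row[0]
```

The order of the pairs inside `raw_attributes` is not guaranteed, and the column becomes NULL once an object has no attributes left. Reads of the combined data touch a single column, while the rebuilding cost is paid only on writes to `object_attributes`.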

paxdiablo answered Sep 20 '22 16:09