How would you structure your entity model for storing arbitrary key/value data with different data types?

Tags:

database-design

I keep coming across scenarios where it will be useful to store a set of arbitrary data in a table using a per-row key/value model, rather than a rigid column/field model. The problem is, I want to store the values with their correct data type rather than converting everything to a string. This means I have to choose either a single table with multiple nullable columns, one for each data type, or a set of value tables, one for each data type. I'm also unsure as to whether I should use full third normal form and separate the keys into a separate table, referencing them via a foreign key from the value table(s), or if it would be better to keep things simple and store the string keys in the value table(s) and accept the duplication of strings.

Old/bad:

This solution makes adding additional values a pain in a fluid environment because the table needs to be modified regularly.

MyTable
============================
ID    Key1    Key2    Key3
int   int     string  date
----------------------------
1     Value1  Value2  Value3
2     Value4  Value5  Value6

Single Table Solution

This solution allows simplicity via a single table. The querying code still needs to check for nulls to determine which data type the field is storing. A check constraint is probably also required to ensure only one of the value fields contains non-null data.

DataValues
=============================================================
ID    RecordID    Key    IntValue    StringValue    DateValue
int   int         string int         string         date
-------------------------------------------------------------
1     1           Key1   Value1      NULL           NULL
2     1           Key2   NULL        Value2         NULL
3     1           Key3   NULL        NULL           Value3
4     2           Key1   Value4      NULL           NULL
5     2           Key2   NULL        Value5         NULL
6     2           Key3   NULL        NULL           Value6

Multiple-Table Solution

This solution allows for more concise purposing of each table, though the code needs to know the data type in advance as it needs to query a different table for each data type. Indexing is probably simpler and more efficient because there are less columns that need indexing.

IntegerValues
===============================
ID    RecordID    Key    Value
int   int         string int
-------------------------------
1     1           Key1   Value1
2     2           Key1   Value4

StringValues
===============================
ID    RecordID    Key    Value
int   int         string string
-------------------------------
1     1           Key2   Value2
2     2           Key2   Value5

DateValues
===============================
ID    RecordID    Key    Value
int   int         string date
-------------------------------
1     1           Key3   Value3
2     2           Key3   Value6

How do you approach this problem? Which solution is better?

Also, should the key column be separated into a separate table and referenced via a foreign key or be should it be kept in the value table and bulk updated if for some reason the key name changes?

479

asked Jun 07 '10 08:06

Nathan Ridley

1 Answers

Another alternative to a single Table design is store the DataType

where you have an internal enum that differentiates the data type ie:

1 - string 2 - int 3 - date 4 - etc

Table

============================
ID    Key    Value    DataType
int   string string   int
----------------------------
1     Key  Value       1

I recommend storing plain text in the value as in the Pragmatic Programmer book, plain text will outlive all data types, and it allows you to perform any manipulation to it.

As what Thomas said, there is always a discipline trade off for EAVs. The discipline comes when you insert the data to ensure that the data type is validated and inserted as the correct type you expect.

When you query you just parse depending on data type on your code.

ie: if response.dataType == enum.date { parseDate(response)} else if response.dataType == enum.int { parseInt(response)} // etc etc

worst case in production if it fails in production and insert the incorrect datatype, you can just alter the datatype in your db and your code should parse accordingly

I just want to state/iterates that EAVs should be used in moderation, and there are certain trade offs going this path, and I recommend reading on them before continuing.

116

answered Sep 22 '22 16:09

mel3kings

Related questions
                            
                                Single Table Inheritance or Class Table Inheritance?
                            
                                MySQL performance of unique varchar field vs unique bigint
                            
                                Best Practices: Storing a workflow state of an item in a database? [closed]
                            
                                In what order are ON DELETE CASCADE constraints processed?
                            
                                How to design a star schema
                            
                                How to store meta-data on columns
                            
                                Table design for user's information as well as login credentials?
                            
                                Wufoo's Database Schema - How would you design it?
                            
                                Database modeller for JPA with visual diagrams
                            
                                What is the optimal indexing strategy for a relation table?
                            
                                Does allowing a category to have multiple parents make sense? Are there alternatives?
                            
                                best way to store 1:1 user relationships in relational database
                            
                                Ms-Access: any need to have a low size for variable-length text fields
                            
                                INT vs VARCHAR in search
                            
                                How to implement a super class, sub class relationship in the database?
                            
                                How to map one class against multiple tables with SQLAlchemy?
                            
                                Multi-tenant rails application: what are the pros and cons of different techniques?
                            
                                Content tagging with MongoDB
                            
                                Why am I getting a an error when creating a generated column in PostgreSQL?
                            
                                Best way to save only day and month in database

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With