I'm working with the new version of a third-party application. In this version, the database structure has been changed, they say "to improve performance".
The old version of the DB had a general structure like this:
TABLE ENTITY
(
ENTITY_ID,
STANDARD_PROPERTY_1,
STANDARD_PROPERTY_2,
STANDARD_PROPERTY_3,
...
)
TABLE ENTITY_PROPERTIES
(
ENTITY_ID,
PROPERTY_KEY,
PROPERTY_VALUE
)
So we had a main table with fields for the basic properties and a separate table to manage custom properties added by the user.
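To give an idea, fetching an entity together with two of its custom properties looked more or less like this (just a sketch; the property keys 'COLOR' and 'SIZE' and the entity ID are made-up examples, with one join per custom property):

-- 'COLOR' and 'SIZE' are hypothetical property keys, used only for illustration
SELECT e.ENTITY_ID,
       e.STANDARD_PROPERTY_1,
       p1.PROPERTY_VALUE AS COLOR,
       p2.PROPERTY_VALUE AS SIZE
FROM ENTITY e
LEFT JOIN ENTITY_PROPERTIES p1
       ON p1.ENTITY_ID = e.ENTITY_ID AND p1.PROPERTY_KEY = 'COLOR'
LEFT JOIN ENTITY_PROPERTIES p2
       ON p2.ENTITY_ID = e.ENTITY_ID AND p2.PROPERTY_KEY = 'SIZE'
WHERE e.ENTITY_ID = 42;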
The new version of the DB instead has a structure like this:
TABLE ENTITY
(
ENTITY_ID,
STANDARD_PROPERTY_1,
STANDARD_PROPERTY_2,
STANDARD_PROPERTY_3,
...
)
TABLE ENTITY_PROPERTIES_n
(
ENTITY_ID_n,
CUSTOM_PROPERTY_1,
CUSTOM_PROPERTY_2,
CUSTOM_PROPERTY_3,
...
)
So now, when the user adds a custom property, a new column is added to the current ENTITY_PROPERTIES_n
table until the max number of columns (managed by the application) is reached, then a new table is created.
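In the new layout the same read looks roughly like this, except that the application first has to know in which ENTITY_PROPERTIES_n table and in which generated column each custom property ended up (again just a sketch; the _1 suffix and the column assignments are made-up examples):

-- Hypothetical mapping: the application put "color" in CUSTOM_PROPERTY_1 and
-- "size" in CUSTOM_PROPERTY_2 of the first properties table
SELECT e.ENTITY_ID,
       e.STANDARD_PROPERTY_1,
       p.CUSTOM_PROPERTY_1 AS COLOR,
       p.CUSTOM_PROPERTY_2 AS SIZE
FROM ENTITY e
LEFT JOIN ENTITY_PROPERTIES_1 p
       ON p.ENTITY_ID_1 = e.ENTITY_ID
WHERE e.ENTITY_ID = 42;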
So, my question is: Is this a correct way to design a DB structure? Is this the only way to "improve performance"? The old structure required many joins or sub-selects, but this structure doesn't seem very smart (or even correct) to me...
I have seen this done before, based on the assumed (often unproven) "expense" of joining - it is basically turning a row-heavy data table into a column-heavy table. They ran into their own limitation, as you imply, by creating new tables when they ran out of columns.
I completely disagree with it.
Personally, I would stick with the old structure and re-evaluate the performance issues. That isn't to say the old way is the correct way; it is just marginally better than the "improvement", in my opinion, and it removes the need to do large-scale re-engineering of database tables and DAL code.
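For what it's worth, the join cost of the old structure can usually be attacked without touching the schema at all - something along these lines, assuming the EAV table above (whether it actually helps depends on the real queries and the DBMS):

-- Composite index so each property lookup is a single index seek;
-- PROPERTY_VALUE is in the key so the join is covered by the index alone.
-- Assumes PROPERTY_VALUE is a reasonably small type, not a large text column.
CREATE INDEX IX_ENTITY_PROPERTIES_LOOKUP
    ON ENTITY_PROPERTIES (ENTITY_ID, PROPERTY_KEY, PROPERTY_VALUE);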
These tables strike me as largely static... caching would be an even better performance improvement, without mutilating the database, and it is the one I would look at doing first. Do the "expensive" fetch once and stick it in memory somewhere, then forget about your troubles (note, I am making light of the need to manage the cache, but static data is one of the easiest to manage).
Or, wait for the day you run into the maximum number of tables per database :-)
Others have suggested completely different stores. This is a perfectly viable possibility, and if I didn't have an existing database structure I would be considering it too. That said, I see no reason why this structure can't fit into an RDBMS. I have seen it done on almost all the large-scale apps I have worked on. Interestingly enough, they all went down a similar route and all were mostly "successful" implementations.
No, it's not. It's terrible.
until the max number of columns (managed by the application) is reached, then a new table is created.
This sentence says it all. Under no circumstances should an application dynamically create tables. The "old" approach isn't ideal either, but since you have the requirement to let users add custom properties, it has to be like this.
Consider this: