Which is more efficient: One long Single Table or Distributed Table? and Why?

Question

This question is all about performance, and I would appreciate it if the answers were specific to the case I provide.

Which is more appropriate performance-wise?

creating a table with too many fields
creating more than one table and distributing similar fields to them

CASE: An Extensive Web CMS Module

Pattern 1: Long but one table

cms
-----------------------------------------------
Id
Title
Description
Images
Order
Status
Publish
meta_keywords
meta_description
meta_author

Clearly, most open-source CMSs, like Joomla, use the above pattern. But I think that pattern kills the spirit of an RDBMS. We can easily separate the content, configuration, and meta of a particular article into different tables, like the following:

Pattern 2: Many but related tables

Cms_content         cms_meta        cms_configuration
---------------------------------------------------------------------------
Id                  id              id          
Title               content_id      content_id
Description         keywords        status
Content             description     order
Images              author          publish

Note: The relations in this case are one-to-one.
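
To make the two patterns concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for MySQL. The column types, the UNIQUE constraint on content_id (one way to express the one-to-one relation), and the sample row are assumptions, not part of the original question. Note that "order" is a reserved word, so it is quoted here; MySQL would use backticks instead.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Pattern 1: one long table
CREATE TABLE cms (
    id INTEGER PRIMARY KEY,
    title TEXT, description TEXT, images TEXT,
    "order" INTEGER, status TEXT, publish INTEGER,
    meta_keywords TEXT, meta_description TEXT, meta_author TEXT
);

-- Pattern 2: three related tables; UNIQUE(content_id) keeps the relation one-to-one
CREATE TABLE cms_content (
    id INTEGER PRIMARY KEY,
    title TEXT, description TEXT, content TEXT, images TEXT
);
CREATE TABLE cms_meta (
    id INTEGER PRIMARY KEY,
    content_id INTEGER NOT NULL UNIQUE REFERENCES cms_content(id),
    keywords TEXT, description TEXT, author TEXT
);
CREATE TABLE cms_configuration (
    id INTEGER PRIMARY KEY,
    content_id INTEGER NOT NULL UNIQUE REFERENCES cms_content(id),
    status TEXT, "order" INTEGER, publish INTEGER
);
""")

# The same hypothetical article stored in both layouts.
conn.execute("INSERT INTO cms (id, title, status, meta_author) "
             "VALUES (1, 'Hello', 'active', 'editor')")
conn.execute("INSERT INTO cms_content (id, title) VALUES (1, 'Hello')")
conn.execute("INSERT INTO cms_meta (content_id, author) VALUES (1, 'editor')")
conn.execute("INSERT INTO cms_configuration (content_id, status) VALUES (1, 'active')")

# Pattern 1 reads from one table.
single = conn.execute(
    "SELECT title, status, meta_author FROM cms WHERE id = 1"
).fetchone()

# Pattern 2 needs two JOINs to assemble the same row.
joined = conn.execute("""
    SELECT c.title, cfg.status, m.author
    FROM cms_content AS c
    JOIN cms_meta AS m ON m.content_id = c.id
    JOIN cms_configuration AS cfg ON cfg.content_id = c.id
    WHERE c.id = 1
""").fetchone()

print(single, joined)
```

Both queries return the same logical row; the difference is whether the database assembles it from one table or from three.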

Which is the proper pattern to follow? Why choose one long table, or why not choose distributed tables over the single table?

Tudor Constantin · Accepted Answer

The only plausible reasons I can think of for having denormalized data (one table with many columns) are:

laziness in writing SQL JOINs
possible performance improvements on read statements

I like to go for the normalised version all the time, because:

I can be sure of data integrity
I can easily extract information from the DB (for example, how many posts have some meta, how many distinct metas there are, etc.)
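
As a rough illustration of these two points, the sketch below uses Python's sqlite3 module in place of MySQL. The trimmed-down tables, the sample rows, and the choice of author as the "meta" being counted are assumptions made for the example; a foreign key is one common way to get the integrity guarantee, not necessarily what the answer had in mind.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is enabled per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE cms_content (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL
);
CREATE TABLE cms_meta (
    id INTEGER PRIMARY KEY,
    content_id INTEGER NOT NULL UNIQUE REFERENCES cms_content(id),
    keywords TEXT,
    author TEXT
);
""")

# Hypothetical sample data: three posts, two of which have meta rows.
conn.executemany(
    "INSERT INTO cms_content (id, title) VALUES (?, ?)",
    [(1, "First post"), (2, "Second post"), (3, "Third post")],
)
conn.executemany(
    "INSERT INTO cms_meta (content_id, keywords, author) VALUES (?, ?, ?)",
    [(1, "sql,design", "ann"), (2, "cms", "bob")],
)

# Data integrity: a meta row pointing at a non-existent post is rejected.
try:
    conn.execute("INSERT INTO cms_meta (content_id, author) VALUES (99, 'ghost')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Extracting information: how many posts have some meta?
posts_with_meta = conn.execute("SELECT COUNT(*) FROM cms_meta").fetchone()[0]

# How many distinct meta authors are there?
distinct_authors = conn.execute(
    "SELECT COUNT(DISTINCT author) FROM cms_meta"
).fetchone()[0]

# Which posts have no meta at all?
posts_without_meta = conn.execute("""
    SELECT COUNT(*)
    FROM cms_content AS c
    LEFT JOIN cms_meta AS m ON m.content_id = c.id
    WHERE m.id IS NULL
""").fetchone()[0]

print(orphan_rejected, posts_with_meta, distinct_authors, posts_without_meta)
```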

lqez · Answer

I think the performance of a 'modern' RDBMS-based application (I don't know exactly what 'modern' means here) depends on more than the database schema:

Database settings: memory usage strategy, key buffer size, query cache size, etc.
Distribution of data/processing: partitioning, grid processing.
Cache strategy: using an embedded cache engine or an external one (like memcached).
Hardware performance

So, estimating performance is not a simple problem. Even a table with 100 fields can fit in memory, while even a two-field table may not. A query over 5M rows can finish in under a minute, but sometimes the same query does not finish within 10 minutes on 10M rows (only twice as many!). It depends on the environmental factors I mentioned above.
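
Since the outcome depends so much on the environment, one practical approach is to measure both layouts on your own setup rather than guess. The following is only a sketch of such a measurement, using Python's sqlite3 module and made-up data; the helper names build and timed, the row count, and the reduced column set are all assumptions. Timings from in-memory SQLite will not carry over to a tuned MySQL server, so treat this as a template for your own test, not as a result.

```python
import sqlite3
import time


def build(conn, rows):
    """Create both layouts and fill them with the same synthetic rows."""
    conn.executescript("""
    CREATE TABLE cms (id INTEGER PRIMARY KEY, title TEXT, status TEXT, meta_author TEXT);
    CREATE TABLE cms_content (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE cms_meta (id INTEGER PRIMARY KEY, content_id INTEGER UNIQUE, author TEXT);
    CREATE TABLE cms_configuration (id INTEGER PRIMARY KEY, content_id INTEGER UNIQUE, status TEXT);
    """)
    data = [
        (i, f"title {i}", "active" if i % 2 else "draft", f"author {i % 10}")
        for i in range(1, rows + 1)
    ]
    conn.executemany("INSERT INTO cms VALUES (?, ?, ?, ?)", data)
    conn.executemany("INSERT INTO cms_content VALUES (?, ?)",
                     [(i, t) for i, t, _, _ in data])
    conn.executemany("INSERT INTO cms_meta (content_id, author) VALUES (?, ?)",
                     [(i, a) for i, _, _, a in data])
    conn.executemany("INSERT INTO cms_configuration (content_id, status) VALUES (?, ?)",
                     [(i, s) for i, _, s, _ in data])
    conn.commit()


def timed(conn, sql):
    """Run a query once and return (elapsed seconds, fetched rows)."""
    start = time.perf_counter()
    rows = conn.execute(sql).fetchall()
    return time.perf_counter() - start, rows


SINGLE = "SELECT id, title, status, meta_author FROM cms ORDER BY id"
JOINED = """
SELECT c.id, c.title, cfg.status, m.author
FROM cms_content AS c
JOIN cms_meta AS m ON m.content_id = c.id
JOIN cms_configuration AS cfg ON cfg.content_id = c.id
ORDER BY c.id
"""

conn = sqlite3.connect(":memory:")
build(conn, rows=10_000)
single_time, single_rows = timed(conn, SINGLE)
joined_time, joined_rows = timed(conn, JOINED)
print(f"single table: {single_time:.4f}s, joined: {joined_time:.4f}s")
```

A single run like this is noisy; repeating each query several times and testing with realistic row counts and indexes gives a more trustworthy comparison.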

Thus, I think we cannot pick one best practice that covers every case. For your example, the choice comes down to the DBA's taste (no joke).

Tags:

mysql

database-design

Starx