Why should we separate dimension and facts instead of combining both in one table

Question

need some insight in dimensional modeling or star schema.

Normally, when we design data warehouse we will have fact tables and dimension tables.

However, it does make sense to embed dimension in fact table. Especially for simple dimensions who has no other attributes AND rarely change its value.

Having dimensions in fact table will makes query run very fast and no need to maintain dimension table separately, no need to look up dimension table when doing ETL.

Is there any consideration keep dimensions separated from facts?

Nick.McDermaid · Accepted Answer

Facts have LOTS of rows. If you put an attribute of, say, length 20 in a fact, more storage is required than if you put stored just an INT surrogate key (4 bytes). More storage = bigger table = decreased performance.
You almost always want to store other hierarchies and attributes against a given attribute. Even if you don't now, you might want to in future
Generally in reporting you'll have a list of these attributes in a drop down to filter on. How do you get these out of a fact? SELECT DISTINCT on a very large table, which is expensive without indexes. With indexes, you're impacting your load performance.

When you put stuff in dimensions instead of facts it means you've done some kind of analysis about how this fits into the business

Why should we separate dimension and facts instead of combining both in one table

Tags:

data-warehouse

star-schema

dimensional-modeling

rendybjunior

1 Answers

Nick.McDermaid

Recent Activity

Donate For Us

Why should we separate dimension and facts instead of combining both in one table

Tags:

data-warehouse

star-schema

dimensional-modeling

rendybjunior

1 Answers

Nick.McDermaid

Related questions

Recent Activity

Donate For Us