Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why should we separate dimension and facts instead of combining both in one table

need some insight in dimensional modeling or star schema.

Normally, when we design data warehouse we will have fact tables and dimension tables.

However, it does make sense to embed dimension in fact table. Especially for simple dimensions who has no other attributes AND rarely change its value.

Having dimensions in fact table will makes query run very fast and no need to maintain dimension table separately, no need to look up dimension table when doing ETL.

Is there any consideration keep dimensions separated from facts?

like image 836
rendybjunior Avatar asked Dec 25 '22 03:12

rendybjunior


1 Answers

  1. Facts have LOTS of rows. If you put an attribute of, say, length 20 in a fact, more storage is required than if you put stored just an INT surrogate key (4 bytes). More storage = bigger table = decreased performance.

  2. You almost always want to store other hierarchies and attributes against a given attribute. Even if you don't now, you might want to in future

  3. Generally in reporting you'll have a list of these attributes in a drop down to filter on. How do you get these out of a fact? SELECT DISTINCT on a very large table, which is expensive without indexes. With indexes, you're impacting your load performance.

When you put stuff in dimensions instead of facts it means you've done some kind of analysis about how this fits into the business

like image 123
Nick.McDermaid Avatar answered May 14 '23 05:05

Nick.McDermaid