need some insight in dimensional modeling or star schema.
Normally, when we design data warehouse we will have fact tables and dimension tables.
However, it does make sense to embed dimension in fact table. Especially for simple dimensions who has no other attributes AND rarely change its value.
Having dimensions in fact table will makes query run very fast and no need to maintain dimension table separately, no need to look up dimension table when doing ETL.
Is there any consideration keep dimensions separated from facts?
Facts have LOTS of rows. If you put an attribute of, say, length 20 in a fact, more storage is required than if you put stored just an INT
surrogate key (4 bytes). More storage = bigger table = decreased performance.
You almost always want to store other hierarchies and attributes against a given attribute. Even if you don't now, you might want to in future
Generally in reporting you'll have a list of these attributes in a drop down to filter on. How do you get these out of a fact? SELECT DISTINCT
on a very large table, which is expensive without indexes. With indexes, you're impacting your load performance.
When you put stuff in dimensions instead of facts it means you've done some kind of analysis about how this fits into the business
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With