Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why we use Dimensional Model over Denormalized relational Model?

I am confused in some questions. I need their answers. If our relational model is also De-normalize then why we prefer dimensional model ? What is the reason we prefer dimensional model over relational model ? Your historical data can also stored in OLTP and you can perform reporting easily on any OLTP then why we use dimensional model and data warehouse ? What is the difference between a dimension and a de-normalized table ?

Thanks in advance

like image 542
Hamid Mahmood Avatar asked Jul 11 '13 23:07

Hamid Mahmood


People also ask

Why should you consider dimensional model for your relational database platform?

By creating dimensional data models, you can design a database to store various data in a centralized place, then design your data in a way that works best for you. It allows and supports faster data retrieval and helps create valuable reports to improve and facilitate future business decision-making.

Why do we use a dimensional model?

It optimises the database for faster retrieval of the data. Dimensional Models have a specific structure and organise the data to generate reports that improve performance. As Dimensional Data Models deal with Dimensions and Fact tables, you'll need to understand these tables in-depth.

Is dimensional model normalized or denormalized?

Dimensional models are more denormalized and optimized for data querying, while normalized models seek to eliminate data redundancies and are optimized for transaction loading and updating.

What is the difference between dimensional model and relational model?

Abstract: Relational modelling is used for the modelling of transactional data stored in relational databases while dimensional modelling is prevalently used for the modelling of dimensional data stored in data warehouses.


2 Answers

Short answer:

If your lookups / retrievals from your OLTP tables are fast enough, and your specific search requirements do not have such complications as are described below, then there should not be a need to get into any dimensional star-schemas.

Long answer:

Dimensional and Denormalized models have different purposes. Dimensional models are generally used for data warehousing scenarios, and are particularly useful where super-fast query results are required for computed numbers such as "quarterly sales by region" or "by salesperson". Data is stored in the Dimensional model after pre-calculating these numbers, and updated as per some fixed schedule.

But even without a data warehouse involved, a Dimensional model could be useful, and its purpose could complement that of the Denormalized model, as in the following example:

A Dimensional model enables fast search. Joins between the dimension tables and the fact table are set up in a star-schema. Searching for John Smith would be simplified because we'll search for John OR Smith only in the relevant dimension table, and fetch the corresponding person ids from the fact table (fact table FKs point to dimension table PKs), thereby getting all persons with either of the 2 keywords in their name. (A further enhancement would enable us to search for all persons having variations of "John Smith" in their names e.g. John, Jon, Johnny, Jonathan, Smith, Psmith, Smythe by building snowflake dimensions.)

A Denormalized model, on the other hand, enables fast retrieval, such as returning back a lot of columns about a specific item without having to join multiple tables together.

So in the above scenario, we would first use the Dimensional model to get a set of IDs for the persons of our interest, and then use the Denormalized table to get full details of those selected IDs without having to do any further joins.

This kind of a search would be very slow if we directly query the Denormalized tables, because a text search will need to be done on the PersonName column. It becomes even slower if we try to include the name variations, or if we need to add more search criteria.

Excellent reference:

An excellent reference for learning about the vast (and very interesting) topic of Dimensional Modeling is Ralph Kimball's The Data Warehouse Lifecycle Toolkit. Its companion volume The Data Warehouse Toolkit covers a large number of actual use cases.

Hope this helps!

like image 125
Krishna Gupta Avatar answered Sep 27 '22 23:09

Krishna Gupta


A dimensional model uses denormalisation as one of its techniques in order to optimise the database for: - query performance, and - user understanding.

OLTP systems are typically hard to report from and also slow, being optimised as they are for OLTP (insert, update, delete) performance and also to protect transactional integrity. A data warehouse, using a dimensional model, still uses relational techniques but is instead optimised to consider the experience of getting the data out over getting the data in.

Truth is, you can't always report easily from any OLTP system: the tables are often obscurely titled without considering people are going to want to get at the data to make business decisions. Reporting tools that generate SQL also struggle to make performant queries on your typical normalised schema.

Modern advances in OLTP technologies provide alternatives to dimensional models that address performance issues, but still do not tackle the typical steps made in creating a dimensional model, to make the database tables easier to comprehend and navigate.

A dimension is a table that is intended to represent a business concept or entity, giving context to a particular measurement of a business process (or 'fact'). Dimensions are typically denormalised in a dimensional model both to reduce the number of tables to comprehend/navigate but also to reduce the number of joins for performance reasons. For example, a Product dimension may contact Brand information whereas in an OLTP model these would be separate tables, which allows users to filter a Fact by Brand directly without traversing multiple tables.

like image 34
Rich Avatar answered Sep 27 '22 23:09

Rich