I need a mental process to design an OLAP database...
Essentially for standard relational it'd be (loosely):
Identify Entities
Identify Relationships
Identify Properties of Entities
For each property:
Ensure property can be related to only one entity
Ensure property is directly related to entity
For OLAP databases, I understand the terminology, the motivation and the structure; however, I have no clue as to how to decompose my relational model into an OLAP model.
OLAP is an answer engine whose goal is to put the power of finding answers into the hands of non-technical individuals. It does this by grouping measures into logical groups and allowing the user to slice the data into different views through the use of dimensions.
Star schema is widely used by all OLAP systems to design OLAP cubes efficiently.
Identify Dimensions (or By's) These are anything that you may want to analyse/group your report by. Every table in the source database is a potential Dimension. Dimensions should be hierarchical if possible, e.g. your Date dimension should have a year,month,day hierarchy, Similarly Location should have for example Country, Region, City hierarchy. This will allow your OLAP tool to more efficiently calculate aggregations.
Identify Measures These are the KPI's or the actual numerical information your client wants to see, these are usually capable of being aggregated, therefore any non flag, non key numeric field in the source database is a potential measure.
Arrange in star schema, with Measures in the center 'Fact' table, and FK relations to applicable Dimension tables. Measures should be stored at the lowest dimension hierarchy level.
Identify the 'Grain' of the fact table, this is essentially the 'level of detail' held. It is usually determined by the reporting requirements, the data granularity available in the source and performance requirements of the reporting solution.You may identify the grain as you go, or you may approach it as a final step once all the important data has been identified. I tend to have a final step to ensure the grain is consistent between my fact tables.
The final step is identifying slowly changing dimensions, and the requirements for these. For example if the customer dimension includes an element of their address and they move, how is that to be handled.
One important point in identify the Dimensions and Measures is the final cardinality that you are electing for the model. Let´s say that your relational database data entry is during all day. Maybe you don´t need to visualize or aggregate the measures by hour, even by day. You can choose a week granularity or monthly etc.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With