Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to design a shrunken dimension table for dates in dimensional warehouse and use in SSAS?

I'm working in a situation where we're moving from having a bunch of transactional fact tables to a more complicated picture with aggregates, snapshots, etc. In the past, there were a few cases where data needed to be aggregated by month, but previous developers had just put the key for the first day of the month it belonged to into a column in the fact table, and pointed it to the usual date dimension. This seems to work OK, we have day/month/year hierarchies in the cubes for each date dimension, and users are doing fine when they need to look at things by month.

When I read around - mostly Kimball's work, but other guides as well - the suggestion is that we should be using a "shrunken dimension" in these cases. The Kimball Group even specifically mention it in regards to a Month dimension. But I'm really not finding a whole lot of information about implementing them past that article, and brief write-ups that seem to be re-phrasing parts of it.


One of my particular concerns is that at the moment, people using our cubes are used to having one date dimension for each different type of date with year-month-day hierarchies, and they just only go down to month level when that's what they need. If this is going to result in a separate dimension with a year-month hierarchy, then it seems like it might be unwelcome clutter. But is this the intention?

The last two paragraphs in the linked article are the only thing I've found tackling how this should work in the presentation layer, and I just don't get what they're trying to describe. It feels short a couple of examples to flesh out how this should appear in a cube. Normally, I'd just trial-and-error this, but timescales are very tight. So...

  • If I do this, what is the intended display in the cube? Would I have two separate date dimensions, with one that only goes down to month?
  • If the above is correct, is there really much point in this, seeing as people can currently happily query things at a monthly level without it? I feel like I'm missing what the benefits really are. I can see it's more semantically correct (we're at month level, so holding first day of the month is hacky and will show unrelated attributes), but with users who are already used to this, I'm not convinced that's enough reason to spend more time on this just now. I can see it could perform better given it'd be a smaller dimension, but we're not having performance issues as-is. Am I missing something?
  • If I do go ahead with the changes, any tips on getting the shrunken dimension working in the cube? Normally I can dig around online until I narrow something down to the best couple of options, but there's really not much around, and I'd appreciate hearing from someone who's done this before. Not looking for anything huge, but something either written a little more technically than that article or a mini-example would probably leave me feeling a lot clearer about what needs doing and why. The Kimball article especially confused me when discussing needing to join the base dimension to the shrunken dimension in order to see attributes.

First two points are the big ones, because I'd know whether I need to make any data warehouse changes, and get them done if so - I would be very glad for an answer on those, even if you can't cover the third point.

like image 762
Jo Douglass Avatar asked Jun 07 '15 20:06

Jo Douglass


People also ask

Can dimension tables have dates?

What is a Date Dimension Table? A date dimension is an essential table in a data model that allows us to analyze performance more effectively across different time periods. It should be included in every dimensional model that contains a date or requires date intelligence as part of the analysis.

How do I create a date dimension in SQL?

A SQL script to create a Date dimension table with logic to specify fiscal dates. [Quarter] CHAR(1), [QuarterName] VARCHAR(9),--First,Second..


1 Answers

This is not an answer nor it is a Cognos fanboi reposonse. For comparison I want to highlight how multi grain facts are modelled in other tools.

http://www-01.ibm.com/support/knowledgecenter/SSWGNW_8.0.0/com.ibm.swg.im.cognos.ug_best.8.4.0.doc/ug_best_id1339multi-factmulti-grainquery.html%23multi-factmulti-grainquery

http://www.cognoise.com/index.php?topic=17992.0

In the first link:

  • The monthly table has a month key and is joined to the month in the calendar table
  • The daily table has a day key and is joined to the same calendar table
  • What the link doesn't show is that you define the hierarchy levels behind the scenes so that the tool automatically knows not to double count the monthly level data
  • The result is the tool automatically knows how to roll up the facts

I'm not a SSAS expert but it appears it does not support this kind of functionality.

If that is the case then it seems to me there is no point in modelling the data 'correctly'. By correctly I mean assigning a particular month to a fact that is only defined at a monthly level.

So far I see no problem with modelling this by assigning a particular day in the month. If the fact table is all at the same level (monthly) then we know that a date in the table represents a month. At the least you might want to put on a check constraint that ensures it's the first of the month so there is no ambiguity.

The result is when you observe monthly and daily facts at a monthly level, everything is consistent. When you observe monthly and daily facts at a daily level you see a big chunk at the start of the month. If you could use SSAS to hide the measure at this level.. problem solved.

like image 151
Nick.McDermaid Avatar answered Sep 20 '22 01:09

Nick.McDermaid