Data mart vs cubes

I'm confused about the warehousing process. I'm in the process of building a data mart, but the part I don't really understand is cubes. I've read some tutorials about SSAS, but I don't see how I can use this data in other applications. What I need is the following:

  • A warehouse (data mart) that contains all the data needed for analysis (drill-down and aggregated data, such as daily revenue and YTD revenue)
  • A .NET web service that can retrieve this data so that many different apps can use it

The part I don't understand is cubes. I see that many people use SSAS to build cubes. What are these cubes in SSAS? Are they objects? Are they tables where data is stored? How can my web service access the data in cubes?

Are there any alternatives to SSAS? Would it be practical to just build cubes in the data mart and load them during the ETL process?

ilija veselica asked Jan 19 '23 10:01


1 Answer

Cubes are pre-aggregated stores of data, in a format that makes reporting much more efficient than is possible in a relational database store. In SSAS you have several choices for how your data is ultimately stored, but generally it is stored in files in the OS file system. Cubes can be queried similarly to SQL (using a specialized query language called MDX) or by several other methods, depending on your version level. You can set up connections to the data for your web service using the appropriate drivers from Microsoft.

I am unsure what you mean by data mart. Are you referring to relational tables in a star schema format? If so, these are generally precursors to the actual cube. From a reporting standpoint, you will not get as much benefit from these relational sources as you would from a cube, since a cube stores the aggregates for each node (or tuple) within the dimensional space defined by your star schema tables.

To explain this: if I have a relational store (even in star schema format) and I want to get sales dollars for a particular location on a particular date, I have to run a query against a very large sales fact table and join the location and date dimension tables (which may also be very large). If I want the same data from a cube, I define my cube filters and the cube query pulls that single tuple from the pre-aggregated data and returns it much more quickly.
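As a rough illustration of the two query paths described above, here is a minimal Python sketch. It is not how SSAS stores or serves data. It only contrasts a join-and-aggregate query against a star schema with a lookup of a pre-aggregated tuple. The table names (`fact_sales`, `dim_location`, `dim_date`), the sample rows, the `"ALL"` roll-up marker, and the helper functions are all hypothetical, chosen for this example.

```python
import sqlite3
from collections import defaultdict

# Hypothetical star schema: one fact table plus location and date dimensions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, day TEXT);
    CREATE TABLE fact_sales (location_id INTEGER, date_id INTEGER, amount REAL);
    INSERT INTO dim_location VALUES (1, 'Zagreb'), (2, 'Split');
    INSERT INTO dim_date VALUES (1, '2023-01-18'), (2, '2023-01-19');
    INSERT INTO fact_sales VALUES
        (1, 1, 100.0), (1, 1, 50.0), (1, 2, 75.0), (2, 1, 20.0), (2, 2, 30.0);
""")

JOINED = """
    FROM fact_sales f
    JOIN dim_location l ON l.location_id = f.location_id
    JOIN dim_date d ON d.date_id = f.date_id
"""

def relational_sales(city, day):
    """Relational route: join and aggregate the fact table on every request."""
    row = conn.execute(
        "SELECT COALESCE(SUM(f.amount), 0)" + JOINED
        + "WHERE l.city = ? AND d.day = ?",
        (city, day),
    ).fetchone()
    return row[0]

def build_cube():
    """Cube-like route: aggregate once (for example during ETL).

    Each (city, day) tuple gets its total, and "ALL" stands for a
    roll-up across that dimension.
    """
    cube = defaultdict(float)
    for city, day, amount in conn.execute("SELECT l.city, d.day, f.amount" + JOINED):
        for key in ((city, day), (city, "ALL"), ("ALL", day), ("ALL", "ALL")):
            cube[key] += amount
    return dict(cube)

cube = build_cube()

print(relational_sales("Zagreb", "2023-01-18"))  # 150.0, computed at query time
print(cube[("Zagreb", "2023-01-18")])            # 150.0, a single tuple lookup
```

In this sketch the cost of the joins is paid once in `build_cube`, and every later request is a dictionary lookup. A real OLAP engine does far more (storage modes, hierarchies, partial aggregation), but the trade-off is the same: more work at load time in exchange for cheaper reads.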

There are many alternatives to SSAS, but each would be some form of cube if you are using a data warehouse. If you have a large data set, a properly designed cube will outperform a relational data mart for multidimensional queries.

William Salzman answered Jan 24 '23 12:01