Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding the differences between CUBE and ROLLUP

My assignment asked me to find out "how many invoices are written for each date?"

I was a little stuck and asked my professor for help. She emailed me a query that would answer the question, "How many stoves of each type and version have been built? For a challenge but no extra points, include the total number of stoves."

This was the query she sent me:

SELECT STOVE.Type + STOVE.Version AS 'Type+Version' , COUNT(*) AS 'The Count' FROM STOVE GROUP BY STOVE.Type + STOVE.Version WITH ROLLUP; 

So, I tweaked that query until it met my needs. This is what I came up with:

SELECT InvoiceDt , COUNT(InvoiceNbr) AS 'Number of Invoices'  FROM INVOICE  GROUP BY InvoiceDt WITH ROLLUP  ORDER BY InvoiceDt ASC; 

And it returned the following results that I wanted.

Anyway, I decided to read up on the ROLLUP clause and started with an article from Microsoft. It said that the ROLLUP clause was similar to the CUBE clause but that it was distinguished from the CUBE clause in the following way:

  1. CUBE generates a result set that shows aggregates for all combinations of values in the selected columns.
  2. ROLLUP generates a result set that shows aggregates for a hierarchy of values in the selected columns.

So, I decided to replace the ROLLUP in my query with CUBE to see what would happen. They produced the same results. I guess that's where I'm getting confused.

It seems like, if you're using the type of query that I am here, that there isn't any practical difference between the two clauses. Is that right? Or, am I not understanding something? I had thought, when I finished reading the Microsoft article, that my results should've been different using the CUBE clause.

like image 630
krebshack Avatar asked Aug 13 '11 21:08

krebshack


People also ask

What does the rollup subclause provide in terms of grouping and how does it differ from the CUBE subclause?

Unlike the CUBE subclause, ROLLUP does not create all possible grouping sets based on the dimension columns; the CUBE makes a subset of those. When generating the grouping sets, ROLLUP assumes a hierarchy among the dimension columns and only generates grouping sets based on this hierarchy.

What is CUBE operator?

CUBE operator is used in the GROUP BY clause of a SELECT statement to return a result set of multidimensional (multiple columns) nature. Example: A table product has the following records:- Apparel. Brand.

What does CUBE do in SQL?

Introduction to SQL CUBE CUBE allows you to generate subtotals like the ROLLUP extension. In addition, the CUBE extension will generate subtotals for all combinations of grouping columns specified in the GROUP BY clause.

What is CUBE group?

GROUP BY CUBE is an extension of the GROUP BY clause similar to GROUP BY ROLLUP. In addition to producing all the rows of a GROUP BY ROLLUP, GROUP BY CUBE adds all the “cross-tabulations” rows.


2 Answers

You won't see any difference since you're only rolling up a single column. Consider an example where we do

ROLLUP (YEAR, MONTH, DAY)

With a ROLLUP, it will have the following outputs:

YEAR, MONTH, DAY YEAR, MONTH YEAR () 

With CUBE, it will have the following:

YEAR, MONTH, DAY YEAR, MONTH YEAR, DAY YEAR MONTH, DAY MONTH DAY () 

CUBE essentially contains every possible rollup scenario for each node whereas ROLLUP will keep the hierarchy in tact (so it won't skip MONTH and show YEAR/DAY, whereas CUBE will)

This is why you didn't see a difference since you only had a single column you were rolling up.

Hope that helps.

like image 197
Derek Kromm Avatar answered Sep 20 '22 18:09

Derek Kromm


We can understand the difference between ROLLUP and CUBE with a simple example. Consider we have a table which contains the results of quarterly test of students. In certain cases we need to see the total corresponding to the quarter as well as the students. Here is the sample table

SELECT * INTO #TEMP FROM (     SELECT 'Quarter 1' PERIOD,'Amar' NAME ,97 MARKS     UNION ALL     SELECT 'Quarter 1','Ram',88      UNION ALL     SELECT 'Quarter 1','Simi',76      UNION ALL     SELECT 'Quarter 2','Amar',94      UNION ALL     SELECT 'Quarter 2','Ram',82      UNION ALL     SELECT 'Quarter 2','Simi',71      UNION ALL     SELECT 'Quarter 3' ,'Amar',95      UNION ALL     SELECT 'Quarter 3','Ram',83      UNION ALL     SELECT 'Quarter 3','Simi',77     UNION ALL     SELECT 'Quarter 4' ,'Amar',91      UNION ALL     SELECT 'Quarter 4','Ram',84      UNION ALL     SELECT 'Quarter 4','Simi',79 )TAB 

enter image description here

1. ROLLUP(Can find total for corresponding to one column)

(a) Get total score of each student in all quarters.

SELECT * FROM #TEMP UNION ALL SELECT PERIOD,NAME,SUM(MARKS) TOTAL FROM #TEMP GROUP BY NAME,PERIOD  WITH ROLLUP HAVING PERIOD IS NULL AND NAME IS NOT NULL  // Having is used inorder to emit a row that is the total of all totals of each student 

Following is the result of (a)

enter image description here

(b) Incase you need to get total score of each quarter

SELECT * FROM #TEMP UNION ALL SELECT PERIOD,NAME,SUM(MARKS) TOTAL FROM #TEMP GROUP BY PERIOD,NAME  WITH ROLLUP HAVING PERIOD IS NOT NULL AND NAME IS NULL 

Following is the result of (b)

enter image description here

2. CUBE(Find total for Quarter as well as students in a single shot)

SELECT PERIOD,NAME,SUM(MARKS) TOTAL FROM #TEMP GROUP BY NAME,PERIOD  WITH CUBE HAVING PERIOD IS NOT NULL OR NAME IS NOT NULL 

Following is the result of CUBE

enter image description here

Now you may be wondering about the real time use of ROLLUP and CUBE. Sometimes we need a report in which we need to see the total of each quarter and total of each student in a single shot. Here is an example

I am changing the above CUBE query slightly as we need total for both totals.

SELECT CASE WHEN PERIOD IS NULL THEN 'TOTAL' ELSE PERIOD END PERIOD, CASE WHEN NAME IS NULL THEN 'TOTAL' ELSE NAME END NAME, SUM(MARKS) MARKS INTO #TEMP2 FROM #TEMP GROUP BY NAME,PERIOD  WITH CUBE  DECLARE @cols NVARCHAR (MAX)  SELECT @cols = COALESCE (@cols + ',[' + PERIOD + ']',                 '[' + PERIOD + ']')                FROM    (SELECT DISTINCT PERIOD FROM #TEMP2) PV                  ORDER BY PERIOD       DECLARE @query NVARCHAR(MAX) SET @query = 'SELECT * FROM               (                  SELECT * FROM #TEMP2              ) x              PIVOT               (                  SUM(MARKS)                  FOR [PERIOD] IN (' + @cols + ')             ) p;'   EXEC SP_EXECUTESQL @query 

Now you will get the following result

enter image description here

like image 32
Sarath KS Avatar answered Sep 22 '22 18:09

Sarath KS