I am using Google Big Query, and I am trying to get a pivoted result out from public sample data set.
A simple query to an existing table is:
SELECT * FROM publicdata:samples.shakespeare LIMIT 10;
This query returns following result set.
Now what I am trying to do is, get the results from the table in such way that if the word is brave, select "BRAVE" as column_1 and if the word is attended, select "ATTENDED" as column_2, and aggregate the word count for these 2.
Here is the query that I am using.
SELECT (CASE WHEN word = 'brave' THEN 'BRAVE' ELSE '' END) AS column_1, (CASE WHEN word = 'attended' THEN 'ATTENDED' ELSE '' END) AS column_2, SUM (word_count) FROM publicdata:samples.shakespeare WHERE (word = 'brave' OR word = 'attended') GROUP BY column_1, column_2 LIMIT 10;
But, this query returns the data
What I was looking for is
I know this pivot for this data set does not make sense. But I am just taking this as an example to explain the problem. It will be great if you can put in some directions for me.
EDITED: I also referred to How to simulate a pivot table with BigQuery? and it seems it also has the same issue I mentioned here.
here is a dynamic pivot procedure in standard SQL@Bigquery. it does not aggregate yet. at first, you need to provide a table with already pe-KPI agregated values (if needed) . but it automatically creates a table and generates all the pivoted columns.
Conventionally, the Pivot operator converts the table's row data into column data. The Unpivot operator does the inverse, transforming column-based data into rows. Google BigQuery offers operators like Pivot and Unpivot that can be used to transpose the BigQuery columns to rows or vice versa.
OFFSET means that the numbering starts at zero, ORDINAL means that the numbering starts at one. A given array can be interpreted as either 0-based or 1-based. When accessing an array element, you must preface the array position with OFFSET or ORDINAL , respectively; there is no default behavior.
Update 2021:
A new PIVOT operator has been introduced into BigQuery.
Before PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:
product | sales | quarter |
---|---|---|
Kale | 51 | Q1 |
Kale | 23 | Q2 |
Kale | 45 | Q3 |
Kale | 3 | Q4 |
Apple | 77 | Q1 |
Apple | 0 | Q2 |
Apple | 25 | Q3 |
Apple | 2 | Q4 |
After PIVOT is used to rotate sales and quarter into Q1, Q2, Q3, Q4 columns:
product | Q1 | Q2 | Q3 | Q4 |
---|---|---|---|---|
Apple | 77 | 0 | 25 | 2 |
Kale | 51 | 23 | 45 | 3 |
Query:
with Produce AS ( SELECT 'Kale' as product, 51 as sales, 'Q1' as quarter UNION ALL SELECT 'Kale', 23, 'Q2' UNION ALL SELECT 'Kale', 45, 'Q3' UNION ALL SELECT 'Kale', 3, 'Q4' UNION ALL SELECT 'Apple', 77, 'Q1' UNION ALL SELECT 'Apple', 0, 'Q2' UNION ALL SELECT 'Apple', 25, 'Q3' UNION ALL SELECT 'Apple', 2, 'Q4') SELECT * FROM (SELECT product, sales, quarter FROM Produce) PIVOT(SUM(sales) FOR quarter IN ('Q1', 'Q2', 'Q3', 'Q4'))
To build list of columns dynamically use execute immediate
:
execute immediate ( select ''' select * from (select product, sales, quarter from Produce) pivot(sum(sales) for quarter in ("''' || string_agg(distinct quarter, '", "' order by quarter) || '''")) ''' from Produce );
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With