Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Presto SQL pivoting (for lack of a better word) data

Tags:

sql

presto

I am working with some course data in a Presto database. The data in the table looks like:

student_id  period   score completed
1           2016_Q1  3     Y
1           2016_Q3  4     Y
3           2017_Q1  4     Y
4           2018_Q1  2     N

I would like to format the data so that it looks like:

student_id  2018_Q1_score 2018_Q1_completed 2017_Q3_score
1           0             N                 5
3           4             Y                 4
4           2             N                 2

I know that I could do this by joining to the table for each time period, but I wanted to ask here to see if any gurus had a recommendation for a more scalable solution (e.g. perhaps not having to manually create a new join for each period). Any suggestions?

like image 628
aguadamuz Avatar asked Jul 17 '18 02:07

aguadamuz


People also ask

Does Presto support PIVOT?

Use of PIVOT / UNPIVOTYou can use the PIVOT and UNPIVOT operators in standard SQL, Hive, and Presto. The PIVOT operator transforms rows into columns. The UNPIVOT operator transforms columns into rows.

What are the limitations of the PIVOT operator in T SQL?

The design of the PIVOT and UNPIVOT operators sometimes leads to bugs and pitfalls in your code. The PIVOT operator's syntax doesn't let you explicitly indicate the grouping element. If you don't realize this, you can end up with undesired grouping elements.

What is the difference between PIVOT and Unpivot?

PIVOT carries out an aggregation and merges possible multiple rows into a single row in the output. UNPIVOT doesn't reproduce the original table-valued expression result because rows have been merged. Also, null values in the input of UNPIVOT disappear in the output.

Which of the following techniques is most suited to transpose data in SQL?

Using a T-SQL Pivot function is one of the simplest method for transposing rows into columns.


1 Answers

You can just use conditional aggregation:

select student_id,
       max(case when period = '2018_Q1' then score else 0 end) as score_2018q1,
       max(case when period = '2018_Q1' then completed then 'N' end) as completed_2018q1,
       max(case when period = '2017_Q3' then score else 0 end) as score_2017q3
from t
group by student_id
like image 115
Gordon Linoff Avatar answered Nov 15 '22 09:11

Gordon Linoff