Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pivoting a non-numeric table in AWS Redshift

I have a table in amazon redshift that I want to pivot along one column. E.g.

dim1  dim2  val
x     a     4s
x     b     5v
y     a     9l
y     b     3t

Would turn into:

dim1   a    b  
x      4s   5v
y      9l   3t

Note that AWS Redshift does not support pivot, crosstab, unnest, or case extension. Also, the solution proposed by AWS in https://forums.aws.amazon.com/thread.jspa?threadID=126369 is not sufficient because it relies on using aggregation functions, and since my data does not consist of numbers, this wont work reliably (right?).

The best solution I've been able to come up with is a self join:

SELECT table.dim1, val as a, b
FROM table
WHERE dim2='a'
JOIN (
  SELECT dim1, val as b 
  FROM table
  WHERE dim2='b') AS t
ON t.dim1 = table.dim1

The problem with this solution is that you have to do one self join for each possible value of dim2, and this quickly becomes unmanageable since (in my real world case) my table is enormous and there are 20+ different values in dim2. For each new value in dim2 I have to do another self join so:

SELECT table.dim1, val as a, b, c
FROM table
WHERE dim2='a'
JOIN (
  SELECT dim1, val as b 
  FROM table
  WHERE dim2='b') AS t
ON t.dim1 = table.dim1
JOIN (
  SELECT dim1, val as c 
  FROM table
  WHERE dim2='c') AS t2
ON t2.dim1 = table.dim1

We can see how this can quickly get out of hand. Is there a better way to do this?

like image 634
Logister Avatar asked Mar 11 '26 03:03

Logister


1 Answers

It turns out that the AWS answer was sufficient, you can just aggregate over the strings by using max():

SELECT dim1, MAX(a) AS a, MAX(b) AS b
FROM (
  SELECT dim1, 
    CASE dim2 WHEN 'a' THEN val ELSE NULL END as a,
    CASE dim2 WHEN 'b' THEN val ELSE NULL END as b
  FROM table
)
GROUP BY dim1;

But with an arbitrarily large number of values in dim2 this is still cumbersome. I'm open to better answers.

like image 200
Logister Avatar answered Mar 12 '26 17:03

Logister