Group Concat in Redshift

I have a table like this:

| Col1       | Col2        |
|:-----------|------------:|
| 1          |        a;b; |    
| 1          |        b;c; |
| 2          |        c;d; |
| 2          |        d;e; |

I want the result to be something like this:

| Col1       | Col2        |
|:-----------|------------:|
| 1          |       a;b;c;|
| 2          |       c;d;e;|

Is there some way to write a set function which collects the unique values in a column into an array and then displays them? I am using Redshift, which is mostly compatible with PostgreSQL apart from the differences listed in its documentation under "Unsupported PostgreSQL Functions".

Sohaib asked Oct 14 '14 16:10



1 Answer

Have a look at Redshift's listagg() function, which is similar to MySQL's group_concat. You would need to split the items first and then use listagg() to build the list of values. Note, though, that, as the documentation states:

LISTAGG does not support DISTINCT expressions

(Edit: As of 11th October 2018, DISTINCT is now supported. See the docs.)

So you will have to take care of that yourself. Assuming you have the following table set up:

create table _test (col1 int, col2 varchar(10));
insert into _test values (1, 'a;b;'), (1, 'b;c;'), (2, 'c;d;'), (2, 'd;e;');
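
To see why the queries below split first and why empty pieces need filtering, it may help to look at what split_part() returns for a single value. This is only a minimal sketch; the column aliases are made up for illustration:

```sql
-- split_part(string, delimiter, position) returns the piece at the
-- given 1-based position, or an empty string when there is none.
select
    split_part('a;b;', ';', 1) as first_item   -- 'a'
  , split_part('a;b;', ';', 2) as second_item  -- 'b'
  , split_part('a;b;', ';', 3) as third_item   -- '' (nothing after the last ';')
;
```

Because of the trailing ';' in the data, any position past the last item yields an empty string rather than an error.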

Fixed number of items in Col2

Perform as many split_part() operations as there are items in Col2, combining them with union so that duplicate items are dropped:

select
    col1
  , listagg(col2, ';') within group (order by col2) as col2
from (
        -- one branch per item position in col2
        select col1, split_part(col2, ';', 1) as col2 from _test
  union select col1, split_part(col2, ';', 2) as col2 from _test
) as parts
group by col1
;
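
Since DISTINCT is now supported (see the edit above), the deduplication could presumably be moved into the aggregate itself. The following is a sketch under that assumption, not a tested replacement; the names item and parts are chosen for this example, and the appended ';' is only there to match the trailing separator in the desired output:

```sql
-- Sketch: let listagg(distinct ...) remove the duplicates,
-- so union all is enough in the subquery.
select
    col1
  , listagg(distinct item, ';') within group (order by item) || ';' as col2
from (
              select col1, split_part(col2, ';', 1) as item from _test
    union all select col1, split_part(col2, ';', 2) as item from _test
) as parts
group by col1
order by col1
;
```

With the sample rows this should give a;b;c; for Col1 = 1 and c;d;e; for Col1 = 2.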

Varying number of items in Col2

Here you need a helper that supplies the part numbers. If the table has at least as many rows as the largest number of items in any Col2 value, a workaround with row_number() can work (but the cross join it requires is expensive for large tables):

with _helper as (
    select
        (row_number() over())::int as part_number
    from
        _test
),
_values as (
    select distinct
        col1
      , split_part(col2, ';', part_number) as col2
    from
        _test, _helper
    where
        length(split_part(col2, ';', part_number)) > 0
)
select
    col1
  , listagg(col2, ';') within group (order by col2) as col2
from
    _values
group by
    col1
;
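
If the table might have fewer rows than there are items in a Col2 value, or if the cross join against the whole table is too costly, one possible variation is to build the helper from literals. This is a sketch that assumes no Col2 value holds more than four items; the upper bound is an assumption of this example and would need to be extended for longer lists:

```sql
-- Sketch: helper built from literals instead of the rows of _test.
-- Assumption: at most 4 items per col2 value; add more rows if needed.
with _helper as (
              select 1 as part_number
    union all select 2
    union all select 3
    union all select 4
),
_values as (
    select distinct
        t.col1
      , split_part(t.col2, ';', h.part_number) as col2
    from
        _test t
        cross join _helper h
    where
        -- positions past the last item come back as empty strings
        length(split_part(t.col2, ';', h.part_number)) > 0
)
select
    col1
  , listagg(col2, ';') within group (order by col2) as col2
from
    _values
group by
    col1
;
```

The helper then has a fixed, small size regardless of how many rows _test contains, at the price of a hard-coded limit on the number of items.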
moertel answered Oct 13 '22 21:10
