Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

MySQL Pivot data with fixed number of columns

Tags:

mysql

pivot

The following is my SELECT statement which pivots my data nicely.

My Data Looks like this:

col_a | col_b | col_c | col_d   | Score
-------------------------------------
stuff | stuff | stuff | null    |  5
stuff | stuff | stuff | title_a |  3
stuff | stuff | stuff | title_x |  4

My current Pivot statement looks like this:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score'
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score'
    .....

This gives me the following results:

col_a | col_b | col_c | Title   | Score | Title   | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x |   4

What I would like to do is check for more titles, however I only want to have four columns in the pivot. There will only ever be a maximum of 2 rows that require pivoting up to the record above. But col_d could contain any title.

For example I tried the following:

My Data now Looks like this:

col_a | col_b | col_c | col_d    | Score
-------------------------------------
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_x  |  4
stuff | stuff | stuff | null     |  5
stuff | stuff | stuff | title_a  |  3
stuff | stuff | stuff | title_bx |  4

My Pivot statement now looks like this:

SELECT `col_a`, `col_b`, `col_c`,
    MAX(CASE `col_d` WHEN 'title_a' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_a' THEN `score` end) AS 'Score'
    MAX(CASE `col_d` WHEN 'title_x' THEN `col_d` end) AS 'Title',
    MAX(CASE `col_d` WHEN 'title_x' THEN `score` end) AS 'Score'
    MAX(CASE `col_d` WHEN 'title_bx' THEN `col_d` end) AS 'Second Title',
    MAX(CASE `col_d` WHEN 'title_bx' THEN `score` end) AS 'Score'
    .....

So as you can see I tried checking for another title, but that just gave me six columns, 2 of them null because in this case the two rows contained title_a and title_bx, so the middle two columns where filled with null.

The output I would like from the above data is:

col_a | col_b | col_c | Title   | Score | Title    | Score
---------------------------------------------------------
stuff | stuff | stuff | title_a |   3   | title_x  |   4
stuff | stuff | stuff | title_a |   3   | title_bx |   4

So my question is how can I check for multiple possible titles in col_d, and only have the 4 columns.

like image 959
superphonic Avatar asked Feb 10 '23 23:02

superphonic


1 Answers

This is kind of messy because MySQL doesn't have windowing functions and you want to include very specific values in the first set of Title/Score columns. You can get the final result by using some user variables to create a row number for those rows where the col_d is not equal to title_a, then join that back to your table.

The syntax will be similar to the following:

select a.col_a, a.col_b, a.col_c,
  max(case when a.col_d = 'title_a' then a.col_d end) title1,
  max(case when a.col_d = 'title_a' then a.score end) score1,
  max(case when na.col_d <> 'title_a' then na.col_d end) title2,
  max(case when na.col_d <> 'title_a' then na.score end) score2
from yourtable a
left join
(
  -- need to generate a row number value for the col_d rows
  -- that aren't equal to title_a
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score,
    @num:=@num+1 rownum
  from yourtable n
  cross join
  (
    select @num:=0
  ) d
  where n.col_d <> 'title_a'
  order by  n.col_a, n.col_b, n.col_c, n.col_d
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
  -- in the event you have more than 2 row only return 2
  and na.rownum <= 2  
where a.col_d = 'title_a'  
group by a.col_a, a.col_b, a.col_c, na.rownum;

See SQL Fiddle with Demo. This gets a result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |

It was pointed out to me that if you will only ever have 2 other values, then you can simply JOIN the data and not use the user variable:

select distinct a.col_a, a.col_b, a.col_c,
  a.col_d title1,
  a.score score1,
  na.col_d title2,
  na.score score2
from yourtable a
left join
(
  select n.col_a, n.col_b, n.col_c, n.col_d,
    n.score
  from yourtable n
  where n.col_d <> 'title_a'
) na
  on a.col_a = na.col_a
  and a.col_b = na.col_b
  and a.col_c = na.col_c
where a.col_d = 'title_a';

See SQL Fiddle with Demo. This gives the same result:

| COL_A | COL_B | COL_C |  TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|---------|--------|----------|--------|
| stuff | stuff | stuff | title_a |      3 |  title_x |      4 |
| stuff | stuff | stuff | title_a |      3 | title_bx |      4 |

Depending on what you actually have for data in col_a, col_b, and col_c you might have to alter this but it should get you the result you need.

Update: Based on your comment that you will not know the values in the col_d column but you just need to split the data into two pivoted columns, the process gets complicated because MySQL doesn't have windowing functions. This would be extremely easy if there was an NTILE function. The NTILE function distributes the rows into a specific number of groups. In this case, your data is being split into 2 groups.

I've modified the code in this blog by SO User, Quassnoi to replicate the NTILE function using user variables. The variables are used to create 2 things, a row number (used during the pivoting) and the ntile value.

The code would be modified into:

select 
  x.col_a,
  x.col_b,
  x.col_c,
  max(case when x.splitgroup = 1 then x.col_d end) as Title1,
  max(case when x.splitgroup = 1 then x.Score end) as Score1,
  max(case when x.splitgroup = 2 then x.col_d end) as Title2,
  max(case when x.splitgroup = 2 then x.Score end) as Score2
from
(
  select src.col_a, src.col_b, src.col_c, src.col_d, src.score,
    src.splitGroup,
    @row:=case when @prev=src.splitGroup then @row else 0 end +1 rownum,
    @prev:=src.splitGroup
  from
  (
    -- mimic NTILE function by splitting the total count of rows
    -- over the number of columns we want (2)
    select d.col_a, d.col_b, d.col_c, d.col_d, d.score, 
      FLOOR((@r * @n) / cnt) + 1 AS splitGroup
    from
    (
      select a.col_a, a.col_b, a.col_c, a.col_d, a.score, grp.cnt
      from yourtable a
      inner join 
      (
        select col_a, col_b, col_c, count(*) as cnt
        from yourtable
        where col_d is not null
        group by col_a, col_b, col_c
      ) grp
        on a.col_a = grp.col_a
        and a.col_b = grp.col_b
        and a.col_c = grp.col_c
      where a.col_d is not null
      order by a.col_a, a.col_b, a.col_c
    ) d
    cross join
    (
      -- @n is equal to the number of new pivoted columns we want
      select @n:=2, @group1:='N', @group2:='N', @group3:='N'
    ) v
    WHERE 
      CASE 
        WHEN @group1 <> col_a AND @group2<> col_b AND @group3 <> col_c 
          THEN @r := -1 
          ELSE 0 END IS NOT NULL
      AND (@r := @r + 1) IS NOT NULL
  ) src
  cross join
  (
    -- these vars are used to get the row number once the data is split
    -- this will be needed for the aggregate/group by on the final select
    select @row:=0, @prev:=1
  ) v2
  order by src.splitGroup
) x
group by x.col_a, x.col_b, x.col_c, x.rowNum;

See SQL Fiddle with Demo. This gives the result:

| COL_A | COL_B | COL_C |   TITLE1 | SCORE1 |   TITLE2 | SCORE2 |
|-------|-------|-------|----------|--------|----------|--------|
| stuff | stuff | stuff |  title_a |      3 | title_tt |      1 |
| stuff | stuff | stuff | title_bx |      0 | title_qq |      1 |
| stuff | stuff | stuff |  title_x |      4 |  title_a |      8 |
| stuff | stuff | stuff | title_yy |      3 |  title_h |      4 |
| stuff | stuff | stuff |  title_a |      2 |  title_o |      6 |
like image 114
Taryn Avatar answered Feb 13 '23 13:02

Taryn