I currently have a table m of the following format:
id    scenario    period    ct
2        1          1       1  
2        1          2       1
2        1          3       1
2        1          4       1
2        2          1       1
2        2          2       1
2        2          3       1
2        2          4       1
2        3          1       1
2        3          2       1
2        3          3       1
2        3          4       1
I want to create the following table:
id    scenario    period    1    2    3    4
2        1          1       1
2        1          2            1
2        1          3                 1
2        1          4                      1
2        2          1       1
2        2          2            1
2        2          3                 1
2        2          4                      1
2        3          1       1
2        3          2            1
2        3          3                 1
2        3          4                      1
The tablefunc extension has been created in my Postgres database already. I'm currently trying to use the crosstab() function to complete the pivot. However, I'm getting a table that looks like the following:
id    scenario    period    1    2   3    4 
2        1          1       1    1   1    1
The query I tried:
SELECT * FROM crosstab(
        'SELECT id, scenario, period, ct FROM m 
            ORDER BY 1',
         'SELECT DISTINCT period FROM m 
            ORDER BY 1')
AS (id, scenario, period, 1, 2, 3, 4);
                This query generates your desired output:
SELECT id, scenario, period, p1, p2, p3, p4  -- all except aux column rn
FROM   crosstab(
  'SELECT row_number() OVER (ORDER BY id, scenario, period)::int AS rn
        , id, scenario, period, period, ct
   FROM   m
   ORDER  BY 1'
, 'VALUES (1), (2), (3), (4)'
   ) AS (rn int, id int, scenario int, period int, p1 int, p2 int, p3 int, p4 int);
Two special difficulties:
You don't have a single unique column for the row_name yet. I use row_number() to generate the surrogate key: rn. I removed it from the outer SELECT to match your desired result.
The way you tried it, id is taken to be the row_name and all input rows are aggregated into a single output row.
You want additional columns (scenario and period) in the result, which must come after the row_name and before the category. You must list period twice to get the original column additionally - redundant as though it may seem.
Basics:
Related to this particular case:
Typically, you would have a query like this:
SELECT id, scenario, p1, p2, p3, p4  -- all except aux column rn
FROM   crosstab(
  'SELECT rank() OVER (ORDER BY id, scenario)::int AS rn
        , id, scenario, period, ct
   FROM   m
   ORDER  BY 1'
, 'VALUES (1), (2), (3), (4)'
   ) AS (rn int, id int, scenario int, p1 int, p2 int, p3 int, p4 int);
With an output like this:
id   scenario   p1   p2   p3   p4
2    1          1    1    1    1
2    2          1    1    1    1
2    3          1    1    1    1
Note the use of rank() instead of row_number() to group same combinations of (id, scenario) together.
The result makes more sense if counts are not all 1.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With