Let's assume I've got some table in long format, like this:
CREATE TEMP TABLE tmp (
id int,
value varchar,
id2 int,
key int);
INSERT INTO tmp VALUES
(1, 87.1, 1444, 102),
(2, 144.9, 13921, 3),
(3, 'A032333', 13921, 7),
(4, 88.9, 13921, 102),
(5, 'JDS-SJDDD', 13921, 101),
(6, 90000, 13921, 140),
(7, 101.1, 33113, 133),
(8, 'KKL-KKIDD', 33113, 101),
(9, 0, 33113, 239),
(10, 933.1, 33113, 250);
I'd need to convert this data to wide format, like this:
CREATE TEMP TABLE tmp2 (
id2 integer,
k3 numeric,
k7 varchar,
k101 varchar,
k102 numeric,
k133 numeric,
k140 int,
k239 int,
k250 numeric);
INSERT INTO tmp2 VALUES
(1444, NULL, NULL, NULL, 87.1, NULL, NULL, NULL, NULL),
(13921, 144.9, 'A032333', 'JDS-SJDDD', 88.9, NULL, 90000, NULL, NULL),
(33113, NULL, NULL, 'KKL-KKIDD', NULL, 101.1, NULL, 0, 933.1);
I have tried with multiple CASE WHEN:
SELECT id2,
CASE WHEN key = 3 THEN value END AS a3,
CASE WHEN key = 7 THEN value END AS a7,
CASE WHEN key = 101 THEN value END AS a101,
CASE WHEN key = 102 THEN value END AS a102,
CASE WHEN key = 133 THEN value END AS a133,
CASE WHEN key = 140 THEN value END AS a140,
CASE WHEN key = 239 THEN value END AS a239,
CASE WHEN key = 250 THEN value END AS a250
FROM tmp;
However, the output preserves multiple rows for id2 while it would be sufficient to preserve only one per value. How it can be adjusted? I was thinking about something like GROUP BY + COALESCE, but COALESCE looks for values across row and I need to return first not-null value with respect to columns. What's more, this approach seems to be highly cumbersome as my original data would contain around 2000 resulting columns, so specifying each column with CASE WHEN will produce large code. Is there any shortcut? If not, how it can be accomplished?
Postgres supports the filter keyword for conditional aggregation, so I recommend:
SELECT id2,
MAX(value) FILTER (WHERE key = 3) AS a3,
MAX(value) FILTER (WHERE key = 7) AS a7,
MAX(value) FILTER (WHERE key = 101) AS a101,
MAX(value) FILTER (WHERE key = 102) AS a102,
MAX(value) FILTER (WHERE key = 133) AS a133,
MAX(value) FILTER (WHERE key = 140) AS a140,
MAX(value) FILTER (WHERE key = 239) AS a239,
MAX(value) FILTER (WHERE key = 250) AS a250
FROM tmp
GROUP BY id2;
But the key idea is the GROUP BY.
You need to group by id2 and aggregate:
SELECT id2,
max(CASE WHEN key = 3 THEN value END) AS a3,
max(CASE WHEN key = 7 THEN value END) AS a7,
max(CASE WHEN key = 101 THEN value END) AS a101,
max(CASE WHEN key = 102 THEN value END) AS a102,
max(CASE WHEN key = 133 THEN value END) AS a133,
max(CASE WHEN key = 140 THEN value END) AS a140,
max(CASE WHEN key = 239 THEN value END) AS a239,
max(CASE WHEN key = 250 THEN value END) AS a250
FROM tmp
group by id2
order by id2
This will work for your sample data.
See the demo.
Results:
> id2 | a3 | a7 | a101 | a102 | a133 | a140 | a239 | a250
> ----: | :---- | :------ | :-------- | :--- | :---- | :---- | :--- | :----
> 1444 | null | null | null | 87.1 | null | null | null | null
> 13921 | 144.9 | A032333 | JDS-SJDDD | 88.9 | null | 90000 | null | null
> 33113 | null | null | KKL-KKIDD | null | 101.1 | null | 0 | 933.1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With