I'm trying to get distinct values when using GROUP_CONCAT in BigQuery.
I'll recreate the situation using a simpler, static example:
EDIT: I've modified the example to represent better my real situation: 2 columns with group_concat which needs to be distinct:
SELECT
category,
GROUP_CONCAT(id) as ids,
GROUP_CONCAT(product) as products
FROM
(SELECT "a" as category, "1" as id, "car" as product),
(SELECT "a" as category, "2" as id, "car" as product),
(SELECT "a" as category, "3" as id, "car" as product),
(SELECT "b" as category, "4" as id, "car" as product),
(SELECT "b" as category, "5" as id, "car" as product),
(SELECT "b" as category, "2" as id, "bike" as product),
(SELECT "a" as category, "1" as id, "truck" as product),
GROUP BY
category
This example returns:
Row category ids products
1 a 1,2,3,1 car,car,car,truck
2 b 4,5,6 car,car,bike
I'd like to strip the duplicated values found, to return like:
Row category ids products
1 a 1,2,3 car,truck
2 b 4,5,6 car,bike
In MySQL, GROUP_CONCAT has a DISTINCT OPTION, but in BigQuery there isn't.
Any ideas?
Here is solution which uses UNIQUE
scope aggregation function to remove duplicates. Note, that in order to use it, first we need to build a REPEATED
using NEST
aggregation:
SELECT
GROUP_CONCAT(UNIQUE(ids)) WITHIN RECORD,
GROUP_CONCAT(UNIQUE(products)) WITHIN RECORD
FROM (
SELECT
category,
NEST(id) as ids,
NEST(product) as products
FROM
(SELECT "a" as category, "1" as id, "car" as product),
(SELECT "a" as category, "2" as id, "car" as product),
(SELECT "a" as category, "3" as id, "car" as product),
(SELECT "b" as category, "4" as id, "car" as product),
(SELECT "b" as category, "5" as id, "car" as product),
(SELECT "b" as category, "2" as id, "bike" as product),
(SELECT "a" as category, "1" as id, "truck" as product),
GROUP BY
category
)
Removing duplicates before applying group_concat will achieve the result you want:
SELECT
category,
GROUP_CONCAT(id) as ids
FROM (
SELECT category, id
FROM
(SELECT "a" as category, "1" as id),
(SELECT "a" as category, "2" as id),
(SELECT "a" as category, "3" as id),
(SELECT "b" as category, "4" as id),
(SELECT "b" as category, "5" as id),
(SELECT "b" as category, "6" as id),
(SELECT "a" as category, "1" as id),
GROUP BY
category, id
)
GROUP BY
category
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With