Here is my query with public dataset in BigQuery:
SELECT RANGE_BUCKET(reputation, [400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000]) AS reputation_group, COUNT(*) AS count
FROM `bigquery-public-data.stackoverflow.users`
Where reputation > 200000
GROUP BY 1
ORDER By 1
The result is below:
Instead of showing reputation_group as integer, how can I show the bucket's range:
0: [0-400000]
1: [400001-500000]
2: [500001-600000]
....
Thank you very much.
UPDATE: A big thank to Mikhail's answer with a minor change below:
SELECT bucket,
FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group,
COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([200000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket
Note an extra item 200000 is added to the STRUCT, which makes the result showing
200001 - 400000
instead of 0 - 400000
Below is for BigQuery Standard SQL
#standardSQL
SELECT bucket,
FORMAT('%i - %i', IFNULL(ranges[SAFE_OFFSET(bucket - 1)] + 1, 0), ranges[SAFE_OFFSET(bucket)]) AS reputation_group,
COUNT(*) AS COUNT
FROM `bigquery-public-data.stackoverflow.users`,
UNNEST([STRUCT([400000, 500000, 600000, 700000, 800000, 900000, 1000000, 1100000, 1200000] AS ranges)]),
UNNEST([RANGE_BUCKET(reputation, ranges)]) bucket
WHERE reputation > 200000
GROUP BY 1, 2
ORDER BY bucket
with result
Row bucket reputation_group COUNT
1 0 0 - 400000 198
2 1 400001 - 500000 23
3 2 500001 - 600000 13
4 3 600001 - 700000 12
5 4 700001 - 800000 4
6 5 800001 - 900000 5
7 6 900001 - 1000000 2
8 8 1100001 - 1200000 1
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With