Consider the following BigQuery query:
SELECT
tn.object AS object_alias,
tn.attribute1 AS attribute1_alias,
tn.attribute2 AS attribute2_alias,
tn.score AS score_alias,
ROW_NUMBER() OVER (PARTITION BY attribute1_alias, attribute2_alias ORDER BY score_alias DESC) AS row_num_alias
FROM
[datasetName.tableName] tn
HAVING # also causes error when using WHERE
row_num_alias <= 20
In this query, the reference to the row_num_alias
field in the HAVING
clause is causing the following error: Field 'row_num_alias' not found.
I get the same error when replacing the HAVING
clause with a WHERE
clause, and it seems like this error is thrown for all window functions.
Is this a bug in BigQuery? Or is there another error in my query?
Possibly related:
One workaround is to just convert this to a subquery and to move the WHERE
clause outside the subquery (see below), but this seems cumbersome (and hopefully isn't necessary).
SELECT
object_alias,
attribute1_alias,
attribute2_alias,
score_alias,
row_num_alias
FROM
(SELECT
tn.object AS object_alias,
tn.attribute1 AS attribute1_alias,
tn.attribute2 AS attribute2_alias,
tn.score AS score_alias,
ROW_NUMBER() OVER (PARTITION BY attribute1_alias, attribute2_alias ORDER BY score_alias DESC) AS row_num_alias
FROM
[datasetName.tableName] tn
)
WHERE
row_num_alias <= 20
Column aliases do not work in the WHERE
clause, even in BigQuery. There is no guarantee that they work in the HAVING
clause either, although some databases do support that. Column aliases can be used in the ORDER BY
; I think this support by the standard is part of phasing out reference-by-number.
You know the right solution, which is to use a subquery.
By the way, a HAVING
clause with no GROUP BY
looks very awkward. Such a construct is often used in MySQL, but only as a work-around -- subqueries in MySQL incur much more overhead than in other databases because the optimizer is not as sophisticated.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With