Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Alias of window function field causing a "not found" error when used in HAVING and WHERE clauses

Consider the following BigQuery query:

SELECT
    tn.object AS object_alias,
    tn.attribute1 AS attribute1_alias,
    tn.attribute2 AS attribute2_alias,
    tn.score AS score_alias,
    ROW_NUMBER() OVER (PARTITION BY attribute1_alias, attribute2_alias ORDER BY score_alias DESC) AS row_num_alias
FROM
    [datasetName.tableName] tn
HAVING # also causes error when using WHERE
    row_num_alias <= 20

In this query, the reference to the row_num_alias field in the HAVING clause is causing the following error: Field 'row_num_alias' not found. I get the same error when replacing the HAVING clause with a WHERE clause, and it seems like this error is thrown for all window functions.

Is this a bug in BigQuery? Or is there another error in my query?

Possibly related:

  • Mysterious error when combining lead function, second window function and column alias
  • https://code.google.com/p/google-bigquery/issues/detail?id=336&q=window%20alias

One workaround is to just convert this to a subquery and to move the WHERE clause outside the subquery (see below), but this seems cumbersome (and hopefully isn't necessary).

SELECT
    object_alias,
    attribute1_alias,
    attribute2_alias,
    score_alias,
    row_num_alias
FROM
    (SELECT
        tn.object AS object_alias,
        tn.attribute1 AS attribute1_alias,
        tn.attribute2 AS attribute2_alias,
        tn.score AS score_alias,
        ROW_NUMBER() OVER (PARTITION BY attribute1_alias, attribute2_alias ORDER BY score_alias DESC) AS row_num_alias
    FROM
        [datasetName.tableName] tn
    )
WHERE
    row_num_alias <= 20
like image 554
Brian Avatar asked Mar 13 '23 11:03

Brian


1 Answers

Column aliases do not work in the WHERE clause, even in BigQuery. There is no guarantee that they work in the HAVING clause either, although some databases do support that. Column aliases can be used in the ORDER BY; I think this support by the standard is part of phasing out reference-by-number.

You know the right solution, which is to use a subquery.

By the way, a HAVING clause with no GROUP BY looks very awkward. Such a construct is often used in MySQL, but only as a work-around -- subqueries in MySQL incur much more overhead than in other databases because the optimizer is not as sophisticated.

like image 102
Gordon Linoff Avatar answered Mar 16 '23 14:03

Gordon Linoff