I am currently having a problem with a Standard SQL query. I have a list of emails where every email can have multiple functions. See the example below on how the table looks like.
Email Function
[email protected] engineer
[email protected] specialist
[email protected] analyst
[email protected] tester
[email protected] manager
[email protected] intern
What I want is a query that returns every email once with the first function it finds. So the above table should return the following:
Email Function
[email protected] engineer
[email protected] analyst
[email protected] intern
How do I do this?
What I have right now is a simplified version of the query.
SELECT Email, Function
FROM database
GROUP BY Email, Function
The issue is here is that I have to put both Email and Function in the GROUP BY. If I only put Email in the Group By the query cannot run even though I only want the query to GROUP BY Email.
Thanks!
To do that, you can use the ROW_NUMBER() function. In OVER() , you specify the groups into which the rows should be divided ( PARTITION BY ) and the order in which the numbers should be assigned to the rows ( ORDER BY ). You assign the row numbers within each group (i.e., year).
ARRAY_AGG. Returns an ARRAY of expression values. To learn more about the optional arguments in this function and how to use them, see Aggregate function calls. To learn more about the OVER clause and how to use it, see Window function calls.
BigQuery STRING_AGG function returns a value by gathering all expressions in a row and generating a concatenated string of non-null values. The value can be either STRING or BYTES. The result generally depends on the first expression taken by the BigQuery STRING_AGG function.
There is no such thing as the "first" function, because SQL tables represent unordered sets -- especially in a designed-from-the-ground-up parallel database such as BigQuery.
You need to use some sort of aggregation function.
A simple one is any_value()
:
SELECT Email, ANY_VALUE(Function)
FROM database
GROUP BY Email;
If you have another column that specifies the ordering, then it can be used to fetch the function
associated with the minimum value of that column.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With