Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

BigQuery fails to save view that uses functions

We're using BigQuery with their new dialect of "standard" SQL. the new SQL supports inline functions written in SQL instead of JS, so we created a function to handle date conversion.

CREATE TEMPORARY FUNCTION
  STR_TO_TIMESTAMP(str STRING)
  RETURNS TIMESTAMP AS (PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%E*SZ', str));

It must be a temporary function as Google returns Error: Only temporary functions are currently supported; use CREATE TEMPORARY FUNCTION if you try a permanent function.

If you try to save a view with a query that uses the function inline - you get the following error: Failed to save view. No support for CREATE TEMPORARY FUNCTION statements inside views. If you try to outsmart it, and remove the function (hoping to add it during query time), you'll receive this error Failed to save view. Function not found: STR_TO_TIMESTAMP at [4:7].

Any suggestions on how to address this? We have more complex functions than the example shown.

like image 610
Ereli Avatar asked Feb 14 '17 16:02

Ereli


People also ask

What are the limitations of BigQuery views?

BigQuery views are subject to the following limitations: Views are read-only. You cannot run DML (insert, update, delete) queries against a view. The dataset that contains your view and the dataset that contains the tables referenced by the view must be in the same location. You cannot run a BigQuery job that exports data from a view.

How to create and test BigQuery object?

Create BigQuery object ( dataset, table, UDF) to meet some business requirement. Create a SQL unit test to check the object. Run SQL unit test to check the object does the job or not. If the test is passed then move on to the next SQL unit test.

Can't Run a BigQuery job that exports data from a view?

You cannot run a BigQuery job that exports data from a view. You cannot use the TableDataList JSON API method to retrieve data from a view. For more information, see Tabledata: list. You cannot mix standard SQL and legacy SQL queries when using views. A standard SQL query cannot reference a view defined using legacy SQL syntax.

What happens if integer is 0 in BigQuery?

If INTEGER is 0, the view doesn't expire. If you don't include the --expiration flag, BigQuery creates the view with the dataset's default table lifetime. DESCRIPTION is a description of the view in quotes. KEY:VALUE is the key-value pair that represents a label.


3 Answers

Since the issue was marked as resolved, BigQuery now supports permanents registration of UDFs. In order to use your UDF in a view, you'll need to first create it.

CREATE OR REPLACE FUNCTION `ACCOUNT-NAME11111.test.STR_TO_TIMESTAMP`
    (str STRING) 
    RETURNS TIMESTAMP AS (PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%E*SZ', str));
  • Note that you must use a backtick for the function's name.
  • There's no TEMPORARY in the statement, as the function will be globally registered and persisted.
  • Due to the way BigQuery handles namespaces, you must include both the project name and the dataset name (test) in the function's name.

Once it's created and working successfully, you can use it a view.

create view test.test_view as
select `ACCOUNT-NAME11111.test.STR_TO_TIMESTAMP`('2015-02-10T13:00:00Z') as ts

You can then query you view directly without explicitly specifying the UDF anywhere.

select * from test.test_view

querying a BigQuery view that uses a permanently registered UDF

like image 70
Ereli Avatar answered Oct 06 '22 23:10

Ereli


Saving a view created with a temp function is still not supported, but what you can do is plan the SQL-query (already rolled out for the latest UI), and then save it as a table. This worked for me, but I guess it depends on the query parameters you want.

##standardSQL
## JS in SQL to extract multiple h.CDs at the same time. 
CREATE TEMPORARY FUNCTION getCustomDimension(cd ARRAY<STRUCT< index INT64, 
value STRING>>, index INT64)
RETURNS STRING
LANGUAGE js AS """
     for(var i = 0; i < cd.length; i++) {
     var item = cd[i];
     if(item.index == index) {
         return item.value
  }
}
return '';
""";

SELECT DISTINCT h.page.pagePath, getcustomDimension(h.customDimensions,20), fullVisitorId,h.page.pagePathLevel1, h.page.pagePathLevel2, h.page.pagePathLevel3, getcustomDimension(h.customDimensions,3)
FROM
`XXX.ga_sessions_*`,
UNNEST(hits) AS h
WHERE
    ### rolling timeframe
    _TABLE_SUFFIX = FORMAT_DATE('%Y%m%d',DATE_SUB(CURRENT_DATE(),INTERVAL YY DAY))
    AND h.type='PAGE'

Credit for the solution goes to https://medium.com/@JustinCarmony/strategies-for-easier-google-analytics-bigquery-analysis-custom-dimensions-cad8afe7a153

like image 38
LJP Avatar answered Oct 07 '22 00:10

LJP


As per the documentation https://cloud.google.com/bigquery/docs/reference/standard-sql/data-definition-language#create_function_statement , the functionality is still in Beta phase but is doable. The functions can be viewed in the same dataset it was created and the view can be created. Please share if that worked fine for you or if you have any findings which would be helpful for others.

like image 1
Priya Agarwal Avatar answered Oct 07 '22 00:10

Priya Agarwal