in my table I have an attribute called 'messages' with this exact data type:
ARRAY<STRUCT<created_time DATETIME ,`from` STRUCT<id STRING,
name STRING,email STRING>, id STRING, message STRING>>
and I have defined a UDF named my_func()
Because UDF function in Big Query don't support the type DATETIME I need to cast the attribute created_time.
So I tried this:
safe_cast ( messages as ARRAY<STRUCT<created_time STRING ,
'from` STRUCT<id STRING, name STRING, email STRING>,
id STRING, message STRING>>) as messages_casted
and I get this error
Casting between arrays with incompatible element types is not
supported: Invalid cast from...
There is an error in the way I cast an array struct?
There is some way to use UDF with this data structure or the only way is to flatten the array and do the cast?
My goal is to take the array in the JS execution environment in order to make the aggregation with JS code.
When working with JavaScript UDFs, you don't need to specify complex input data types explicitly. Instead, you can use the TO_JSON_STRING function. In your case, you can have the UDF take messages as a STRING, then parse it inside the UDF. You would call my_func(TO_JSON_STRING(messages)), for example.
Here is an example from the documentation:
CREATE TEMP FUNCTION SumFieldsNamedFoo(json_row STRING)
RETURNS FLOAT64
LANGUAGE js AS """
function SumFoo(obj) {
var sum = 0;
for (var field in obj) {
if (obj.hasOwnProperty(field) && obj[field] != null) {
if (typeof obj[field] == "object") {
sum += SumFoo(obj[field]);
} else if (field == "foo") {
sum += obj[field];
}
}
}
return sum;
}
var row = JSON.parse(json_row);
return SumFoo(row);
""";
WITH Input AS (
SELECT STRUCT(1 AS foo, 2 AS bar, STRUCT('foo' AS x, 3.14 AS foo) AS baz) AS s, 10 AS foo UNION ALL
SELECT NULL, 4 AS foo UNION ALL
SELECT STRUCT(NULL, 2 AS bar, STRUCT('fizz' AS x, 1.59 AS foo) AS baz) AS s, NULL AS foo
)
SELECT
TO_JSON_STRING(t) AS json_row,
SumFieldsNamedFoo(TO_JSON_STRING(t)) AS foo_sum
FROM Input AS t;
+---------------------------------------------------------------------+---------+
| json_row | foo_sum |
+---------------------------------------------------------------------+---------+
| {"s":{"foo":1,"bar":2,"baz":{"x":"foo","foo":3.14}},"foo":10} | 14.14 |
| {"s":null,"foo":4} | 4 |
| {"s":{"foo":null,"bar":2,"baz":{"x":"fizz","foo":1.59}},"foo":null} | 1.59 |
+---------------------------------------------------------------------+---------+
Because UDF function in Big Query don't support the type DATETIME I need to cast the attribute created_time.
Below is for BigQuery Standard SQL and is a very simple way of Casting specific element of ARRAY leaving everything else as is.
So in below example - it CASTs created_time from DATETIME to STRING (you can use any compatible type you need in your case though)
#standardSQL
SELECT messages,
ARRAY(SELECT AS STRUCT * REPLACE (SAFE_CAST(created_time AS STRING) AS created_time)
FROM UNNEST(messages) message
) casted_messages
FROM `project.dataset.table`
If you run it against your data - you will see original and casted messages - all elements should be same (value/type) with exception of (as expected) created_time which will be of casted type (STRING in this particular case) or NULL if not compatible
You can test / play with above using dummy data as below
#standardSQL
WITH `project.dataset.table` AS (
SELECT [STRUCT<created_time DATETIME,
`from` STRUCT<id STRING, name STRING, email STRING>,
id STRING,
message STRING>
(DATETIME '2018-01-01 13:00:00', ('1', 'mike', 'zzz@ccc'), 'a', 'abc'),
(DATETIME '2018-01-02 14:00:00', ('2', 'john', 'yyy@bbb'), 'b', 'xyz')
] messages
)
SELECT messages,
ARRAY(SELECT AS STRUCT * REPLACE (SAFE_CAST(created_time AS STRING) AS created_time)
FROM UNNEST(messages) message
) casted_messages
FROM `project.dataset.table`
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With