Is BigQuery ANY_VALUE non-deterministic?

Question

Is BigQuery's ANY_VALUE deterministic? I have a query that produces ~200,000 rows, but if I filter out duplicate entries after the query, they reduce to ~500. To solve that in the query itself, I added a GROUP BY and wrapped all the attributes with `ANY_VALUE(tN.fieldX) AS tN_fieldX`. When I sort the output, save it as .csv, and run the query several times, the resulting file has the same md5sum every time.

Does this mean that ANY_VALUE is actually solving my problem of duplicate entries? I expected it to give different values on every run, since it is non-deterministic in BigQuery.

Mikhail Berlyant · Accepted Answer

Obviously, ANY_VALUE is non-deterministic. But if you apply it within a GROUP BY where every row in a group carries the same value, it becomes effectively deterministic: it picks an arbitrary value from a group of identical values, so whichever row it picks, the result is the same. So yes, it helps solve the problem of duplicates in cases like yours.
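
To make the reasoning above concrete, here is a minimal Python sketch that simulates the behavior rather than running BigQuery itself. The helper names (`group_any_value`, `distinct_outputs`) and the sample rows are hypothetical, and modeling ANY_VALUE as an independent random pick per column is an assumption made for illustration, not a description of how BigQuery chooses internally.

```python
import random
from collections import defaultdict


def group_any_value(rows, rng):
    """Simulate GROUP BY on the first column with ANY_VALUE on the rest.

    Assumption: each ANY_VALUE call is modeled as an independent random
    pick from its group, standing in for the engine's unspecified choice.
    """
    groups = defaultdict(list)
    for key, *fields in rows:
        groups[key].append(fields)

    result = []
    for key, members in groups.items():
        # zip(*members) turns the group's rows into per-column value lists.
        picked = tuple(rng.choice(column) for column in zip(*members))
        result.append((key, *picked))
    return sorted(result)  # sorted, like the sorted .csv in the question


def distinct_outputs(rows, runs=50):
    """Run the simulated query with different seeds and collect
    every distinct result that appears."""
    return {tuple(group_any_value(rows, random.Random(seed))) for seed in range(runs)}


# Hypothetical data: every row in a group is an exact duplicate.
exact_duplicates = [(1, "a", "x")] * 3 + [(2, "b", "y")] * 2

# Hypothetical data: group 1 contains rows that really differ.
mixed_rows = [(1, "a", "x"), (1, "c", "z"), (2, "b", "y")]

if __name__ == "__main__":
    print("distinct results, exact duplicates:", len(distinct_outputs(exact_duplicates)))
    print("distinct results, differing rows:  ", len(distinct_outputs(mixed_rows)))
```

With exact duplicates, every run collapses to the same rows, which matches the identical md5sum seen in the question. If the rows within a group differed, repeated runs could return different values, so a stable checksum also suggests that the grouped rows really are duplicates.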

Tags: sql, google-bigquery
