Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How exactly is the value of count(*) determined in BigQuery?

I am joining a table of about 70000 rows with a slightly bigger second table through inner join each. Now count(a.business_column) and count(*) give different results. The former correctly reports back ~70000, while the latter gives ~200000. But this only happens when I select count(*) alone, when I select them together they give the same result (~70000). How is this possible?

select
   count(*)
   /*,count(a.business_column)*/

from table_a a
inner join each table_b b
   on b.key_column = a.business_column
like image 264
oulenz Avatar asked Sep 09 '15 15:09

oulenz


1 Answers

UPDATE: For a step by step explanation on how this works, see BigQuery flattens when using field with same name as repeated field instead.


To answer the title question: COUNT(*) in BigQuery is always accurate.

The caveat is that in SQL COUNT(*) and COUNT(column) have semantically different meanings - and the sample query can be interpreted in different ways.

See: http://www.xaprb.com/blog/2009/04/08/the-dangerous-subtleties-of-left-join-and-count-in-sql/

There they have this sample query:

select user.userid, count(email.subject)
from user
   inner join email on user.userid = email.userid
group by user.userid;

That query turns out to be ambigous, and the article author changes it for a more explicit one, adding this comment:

But what if that’s not what the author of the query meant? There’s no way to really know. There are several possible intended meanings for the query, and there are several different ways to write the query to express those meanings more clearly. But the original query is ambiguous, for a few reasons. And everyone who reads this query afterwards will end up guessing what the original author meant. “I think I can safely change this to…”


UPDATE: For a step by step explanation on how this works, see BigQuery flattens when using field with same name as repeated field instead.

like image 122
Felipe Hoffa Avatar answered Sep 24 '22 02:09

Felipe Hoffa