Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

"My Alias" is an inline table, and so cannot be in outer part of an outer join

I am trying to run a query similar to:

Select Table1.a,Table1.b,Table2.c
From 
(Select a, max(x) as b
from Tbl1 
group by a
) as Table1
LEFT JOIN EACH Table2
ON 
Join Condition

and I get "Table1 is an inline table, and so cannot be in outer part of an outer join."

when changing the Join from Outer to Inner, it works (hence no syntax issue...).

Can someone please explain this message, and what should i do differently to avoid it?

Thanks

like image 959
N.N. Avatar asked Jan 19 '14 12:01

N.N.


1 Answers

TL;DR: Use GROUP EACH BY in the inner query.

The inner query

Select a, max(x) as b from Tbl1  group by a 

is not parallizable. You can compute partial results in parallel, but but in order to know the global max(x) for each value of a', you need to have that run in only one place. When you use JOIN EACH in the outer query, you're instructing the query engine that you need to parallelize the JOIN execution. However, you don't have a paralelizable source, so the query fails.

There are two ways to fix this: The first is to use an explicitly parallel version of the inner query -- just use GROUP EACH BY instead of GROUP BY. This can execute in parallel because the first step is to sort the underlying table by the a field, so that you can compute the global maximum per a field in a parallel worker. If the inner query is paralellizable, then the outer query can execute in parallel as well. This would look like:

Select Table1.a,Table1.b,Table2.c
From 
(Select a, max(x) as b
from Tbl1 
group EACH by a -- note the EACH keyword here
) as Table1
LEFT JOIN EACH Table2
ON 
Join Condition

The second option is to use JOIN instead of JOIN EACH on the outer query. This might run into size limits depending on the size of Table2 (since to do a non-each join, the table on the right side needs to be 'small'.).

This is admittedly sub-optimal; you have to understand how the query engine works in order to make your queries run. That said, we (on the bigquery and dremel teams) are working hard on making queries 'just work' so that you don't have to deal with this kind of issue. We're making some progress on this, a number of these types of errors have been removed in the last couple of months (for example, an inner join would have failed too until recently), but we still have a ways to go.

Thanks for bringing this to our attention. I've filed an internal bug so that we can make this work better in the future.

like image 160
Jordan Tigani Avatar answered Oct 22 '22 04:10

Jordan Tigani