Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

FAILED: Error in semantic analysis: Column Found in more than One Tables/Subqueries

Tags:

hadoop

hive

CREATE EXTERNAL TABLE old_events
(day STRING, foo STRING, count STRING, internal_id STRING)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '${INPUT}';

CREATE EXTERNAL TABLE events
(internal_id, foo STRING, count STRING)
PARTITIONED BY (ds string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '${OUTPUT}';

INSERT OVERWRITE TABLE events
SELECT e2.internal_id, e2.foo, count(e1.foo)
FROM old_events e2
LEFT OUTER JOIN old_events e1
ON e1.foo = e2.foo
WHERE e1.event = 'event1'
AND e2.event = 'event2'
AND ds = date_sub('${DAY}',1)
GROUP BY e2.internal_id, e2.foo;

FAILED: Error in semantic analysis: Column ds Found in more than One Tables/Subqueries

I am getting this error, when adding ds variable that fetches current date. How do i implemented with the above script to work for date partition.

like image 946
Brisi Avatar asked May 17 '13 12:05

Brisi


1 Answers

You need to add an alias to ds in your WHERE clause. ex, ds = date_sub('${DAY}',1) to e2.ds = date_sub('${DAY}',1).

To clarify your problem a little bit, here is a smaller example which shows the same behavior

CREATE EXTERNAL TABLE example
(a INT, b INT)
LOCATION '${OUTPUT}';

SELECT *
FROM example e1
JOIN example e2
  ON e1.a = e2.a
WHERE b = 5;

This creates the same error:

FAILED: SemanticException Column b Found in more than One Tables/Subqueries

The problem is that column b exists in both example aliased as e1 and e2. You and I might know that if you join example on itself on column a then e1.b is the same as e2.b so it should not need an alias, but Hive does not know this so you need to pick one to remove any ambiguity. It doesn't matter here whether b is a partition column or not.

like image 120
Daniel Koverman Avatar answered Oct 16 '22 09:10

Daniel Koverman