I am getting:
FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
While trying to make a copy of a partitioned table using the commands in the hive console:
CREATE TABLE copy_table_name LIKE table_name; INSERT OVERWRITE TABLE copy_table_name PARTITION(day) SELECT * FROM table_name;
I initially got some semantic analysis errors and had to set:
set hive.exec.dynamic.partition=true set hive.exec.dynamic.partition.mode=nonstrict
Although I'm not sure what the above properties do?
Full ouput from hive console:
Total MapReduce jobs = 1 Launching Job 1 out of 1 Number of reduce tasks determined at compile time: 1 In order to change the average load for a reducer (in bytes): set hive.exec.reducers.bytes.per.reducer=<number> In order to limit the maximum number of reducers: set hive.exec.reducers.max=<number> In order to set a constant number of reducers: set mapred.reduce.tasks=<number> Starting Job = job_201206191101_4557, Tracking URL = http://jobtracker:50030/jobdetails.jsp?jobid=job_201206191101_4557 Kill Command = /usr/lib/hadoop/bin/hadoop job -Dmapred.job.tracker=master:8021 -kill job_201206191101_4557 2012-06-25 09:53:05,826 Stage-1 map = 0%, reduce = 0% 2012-06-25 09:53:53,044 Stage-1 map = 100%, reduce = 100% Ended Job = job_201206191101_4557 with errors FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
This issue generally occurs when the memory configured for the 'Map' task is insufficient to complete the task. To resolve the issue, it would be required to increase the memory for 'Map Task' from existing value. Hope this will help you.
Hive enables data summarization, querying, and analysis of data. Hive queries are written in HiveQL, which is a query language similar to SQL. Hive allows you to project structure on largely unstructured data. After you define the structure, you can use HiveQL to query the data without knowledge of Java or MapReduce.
Hive allows users to read, write, and manage petabytes of data using SQL. Hive is built on top of Apache Hadoop, which is an open-source framework used to efficiently store and process large datasets. As a result, Hive is closely integrated with Hadoop, and is designed to work quickly on petabytes of data.
Issue the SHOW TABLES command to see the views or tables that exist within workspace. Switch to the Hive schema and issue the SHOW TABLES command to see the Hive tables that exist. Switch to the HBase schema and issue the SHOW TABLES command to see the HBase tables that exist within the schema.
That's not the real error, here's how to find it:
Go to the hadoop jobtracker web-dashboard, find the hive mapreduce jobs that failed and look at the logs of the failed tasks. That will show you the real error.
The console output errors are useless, largely beause it doesn't have a view of the individual jobs/tasks to pull the real errors (there could be errors in multiple tasks)
Hope that helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With