
Hive always runs mapred jobs in local mode

We are testing a multi-node Hadoop cluster (2.4.0) with Hive (0.13.0). The cluster works fine, but when we run a query in Hive, the MapReduce jobs are always executed locally. For example:

Without hive-site.xml (in fact, without any configuration file other than defaults) we set mapred.job.tracker:

hive> SET mapred.job.tracker=192.168.7.183:8032;

And run a query:

hive> select count(1) from suricata;

Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
OpenJDK 64-Bit Server VM warning: You have loaded library /hadoop/hadoop-2.4.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
14/04/29 12:48:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/04/29 12:48:02 WARN conf.Configuration: file:/tmp/hadoopuser/hive_2014-04-29_12-47-57_290_2455239450939088471-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
14/04/29 12:48:02 WARN conf.Configuration: file:/tmp/hadoopuser/hive_2014-04-29_12-47-57_290_2455239450939088471-1/-local-10003/jobconf.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
Execution log at: /tmp/hadoopuser/hadoopuser_20140429124747_badfcce6-620e-4718-8c3b-e4ef76bdba7e.log
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-04-29 12:48:05,450 null map = 0%,  reduce = 0%
.......
.......
2014-04-29 12:52:26,982 null map = 100%,  reduce = 100%
Ended Job = job_local1983771849_0001
Execution completed successfully
**MapredLocal task succeeded**
OK
266559841
Time taken: 270.176 seconds, Fetched: 1 row(s)

What are we missing?

user2591846 asked Apr 29 '14 16:04


2 Answers

Set hive.exec.mode.local.auto to false, which disables automatic local-mode execution in Hive.
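
As a minimal sketch of this suggestion, the fragment below prints the current value and then turns the setting off. It assumes the statements are typed at the hive> prompt or run from a script file with hive -f. A value set this way applies only to that session unless it is also placed in hive-site.xml.

```sql
-- Print the current value of the automatic local-mode switch.
SET hive.exec.mode.local.auto;

-- Disable automatic local-mode execution for this session.
SET hive.exec.mode.local.auto=false;
```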

Nithin K Anil answered Nov 05 '22 18:11


For each query, the compiler generates a DAG of map-reduce jobs. If a job runs in local mode, check the following properties:

mapreduce.framework.name=local;
hive.exec.mode.local.auto=false;
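
One way to check and change the first of these properties from within a Hive session is sketched below. It rests on two assumptions the question does not confirm: that the cluster runs YARN (the usual setup for Hadoop 2.x), and that 192.168.7.183:8032 from the question is the ResourceManager address. On Hadoop 2.x, mapreduce.framework.name accepts local, classic, or yarn. A value of local runs jobs in-process, which would match the job_local1983771849_0001 ID in the question's log.

```sql
-- Print the framework currently in effect; "local" means jobs run in-process.
SET mapreduce.framework.name;

-- Assumption: the cluster uses YARN, so submit jobs to it instead.
SET mapreduce.framework.name=yarn;

-- Assumption: this host:port (taken from the question) is the YARN ResourceManager.
SET yarn.resourcemanager.address=192.168.7.183:8032;
```

If this turns out to be the cause, the same values would normally go into mapred-site.xml and yarn-site.xml on the client so that they persist across sessions.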

If the auto option is enabled, Hive runs the job in local mode when all of the following hold:

Total input size < hive.exec.mode.local.auto.inputbytes.max
Total number of map tasks < hive.exec.mode.local.auto.tasks.max
Total number of reduce tasks is 1 or 0

These options are available from Hive 0.7.
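
To see which thresholds apply on a given installation, the property names listed above can be queried with SET and no value. This is only a sketch: the names are taken from this answer, and a property that the installed Hive version does not define may be reported as undefined.

```sql
-- Input-size threshold (in bytes) below which local mode is considered.
SET hive.exec.mode.local.auto.inputbytes.max;

-- Map-task threshold; name taken from the list above, may not exist in every Hive version.
SET hive.exec.mode.local.auto.tasks.max;
```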

Sonu answered Nov 05 '22 18:11