
What should hadoop.tmp.dir be?

Hadoop has a configuration parameter, hadoop.tmp.dir, which according to the documentation is "A base for other temporary directories." I presume this path refers to the local file system.

I set this value to /mnt/hadoop-tmp/hadoop-${user.name}. After formatting the namenode and starting all services, I see exactly the same path created on HDFS.

Does this mean hadoop.tmp.dir refers to a temporary location on HDFS?

Shashikant Kore asked Mar 01 '10 08:03



2 Answers

It's confusing, but hadoop.tmp.dir is used as the base for temporary directories both locally and in HDFS. The documentation isn't great, but mapred.system.dir is set by default to "${hadoop.tmp.dir}/mapred/system", and this defines the path on HDFS where the Map/Reduce framework stores system files.

If you don't want these to be tied together, you can edit your mapred-site.xml so that the definition of mapred.system.dir does not depend on ${hadoop.tmp.dir}.
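As a rough sketch of that override, the fragment below sets mapred.system.dir to a fixed HDFS path in mapred-site.xml. The path /hadoop/mapred/system is a made-up example, not a recommended value; pick whatever location suits your cluster.

```xml
<!-- mapred-site.xml: a minimal sketch, not a complete configuration -->
<configuration>
  <property>
    <name>mapred.system.dir</name>
    <!-- Hypothetical HDFS path chosen for illustration only.
         Because it does not reference ${hadoop.tmp.dir}, changing
         hadoop.tmp.dir no longer moves this directory. -->
    <value>/hadoop/mapred/system</value>
  </property>
</configuration>
```

With something like this in place, hadoop.tmp.dir should only affect the properties that still reference it in their values.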

kkrugler answered Sep 26 '22 09:09


Let me add a bit more to kkrugler's answer:

There are three HDFS properties whose default values contain hadoop.tmp.dir:

  1. dfs.name.dir: directory where the namenode stores its metadata, with default value ${hadoop.tmp.dir}/dfs/name.
  2. dfs.data.dir: directory where HDFS data blocks are stored, with default value ${hadoop.tmp.dir}/dfs/data.
  3. fs.checkpoint.dir: directory where the secondary namenode stores its checkpoints, with default value ${hadoop.tmp.dir}/dfs/namesecondary.

This is why you saw /mnt/hadoop-tmp/hadoop-${user.name} in your HDFS after formatting the namenode.
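If you would rather not have these three locations follow hadoop.tmp.dir, one option is to set them explicitly. The sketch below assumes the properties are picked up from hdfs-site.xml by the HDFS daemons in your version (the file that declares each default can differ between releases), and every path under /data/hadoop is a hypothetical placeholder.

```xml
<!-- hdfs-site.xml: a minimal sketch with placeholder paths -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <!-- Hypothetical directory for namenode metadata -->
    <value>/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- Hypothetical directory for HDFS data blocks -->
    <value>/data/hadoop/dfs/data</value>
  </property>
  <property>
    <name>fs.checkpoint.dir</name>
    <!-- Hypothetical directory for secondary namenode checkpoints -->
    <value>/data/hadoop/dfs/namesecondary</value>
  </property>
</configuration>
```

Since none of these values reference ${hadoop.tmp.dir}, they should stay put when hadoop.tmp.dir is changed.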

darcyy answered Sep 24 '22 09:09