Hadoop has the configuration parameter hadoop.tmp.dir which, as per the documentation, is "A base for other temporary directories." I presume this path refers to the local file system.
I set this value to /mnt/hadoop-tmp/hadoop-${user.name}. After formatting the namenode and starting all services, I see exactly the same path created on HDFS.
Does this mean hadoop.tmp.dir refers to a temporary location on HDFS?
It's confusing, but hadoop.tmp.dir is used as the base for temporary directories both locally and in HDFS. The documentation isn't great, but mapred.system.dir is set by default to "${hadoop.tmp.dir}/mapred/system", and this defines the path on HDFS where the Map/Reduce framework stores system files.
If you don't want these to be tied together, you can edit your mapred-site.xml so that the definition of mapred.system.dir does not depend on ${hadoop.tmp.dir}.
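As a rough sketch of that override, a mapred-site.xml entry might look like the following. The path /hadoop/mapred/system is a made-up placeholder, not a recommended value; substitute whatever HDFS location suits your cluster.

```xml
<?xml version="1.0"?>
<!-- mapred-site.xml: sketch only, the value below is a hypothetical path -->
<configuration>
  <property>
    <name>mapred.system.dir</name>
    <!-- A fixed path on HDFS that no longer references ${hadoop.tmp.dir} -->
    <value>/hadoop/mapred/system</value>
  </property>
</configuration>
```

With an explicit value like this, changing hadoop.tmp.dir should no longer move the Map/Reduce system directory.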
Let me add a bit more to kkrugler's answer:
There are three HDFS properties which contain hadoop.tmp.dir in their default values:
dfs.name.dir: directory where the namenode stores its metadata, with default value ${hadoop.tmp.dir}/dfs/name.
dfs.data.dir: directory where HDFS data blocks are stored, with default value ${hadoop.tmp.dir}/dfs/data.
fs.checkpoint.dir: directory where the secondary namenode stores its checkpoints, with default value ${hadoop.tmp.dir}/dfs/namesecondary.
This is why you saw /mnt/hadoop-tmp/hadoop-${user.name} in your HDFS after formatting the namenode.
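If you would rather not have these directories derived from hadoop.tmp.dir, the same properties can be given explicit values. The snippet below is only a sketch for hdfs-site.xml covering the first two properties; the /data/hadoop/... paths are hypothetical placeholders.

```xml
<?xml version="1.0"?>
<!-- hdfs-site.xml: sketch only, both paths below are hypothetical -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <!-- Replaces the default ${hadoop.tmp.dir}/dfs/name -->
    <value>/data/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <!-- Replaces the default ${hadoop.tmp.dir}/dfs/data -->
    <value>/data/hadoop/dfs/data</value>
  </property>
</configuration>
```

fs.checkpoint.dir can be overridden in the same way in whichever site file your Hadoop version reads it from.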