Hadoop has configuration parameter hadoop.tmp.dir which, as per documentation, is `"A base for other temporary directories." I presume, this path refers to local file system.
I set this value to /mnt/hadoop-tmp/hadoop-${user.name}. After formatting the namenode and starting all services, I see exactly same path created on HDFS.
Does this mean, hadoop.tmp.dir refers to temporary location on HDFS?
As per documentation, hadoop. tmp. dir is `"A base for other temporary directories." I presume, this path refers to local file system. I set this value to /mnt/hadoop-tmp/hadoop-${user.name}. After formatting the namenode and starting all services, I see exactly same path created on HDFS.
dir: directory where HDFS data blocks are stored, with default value ${hadoop. tmp. dir}/dfs/data.
It's confusing, but hadoop.tmp.dir is used as the base for temporary directories locally, and also in HDFS. The document isn't great, but mapred.system.dir is set by default to "${hadoop.tmp.dir}/mapred/system", and this defines the Path on the HDFS where where the Map/Reduce framework stores system files.
If you want these to not be tied together, you can edit your mapred-site.xml such that the definition of mapred.system.dir is something that's not tied to ${hadoop.tmp.dir}
Let me add a bit more to kkrugler's answer:
There're three HDFS properties which contain hadoop.tmp.dir in their values
dfs.name.dir: directory where namenode stores its metadata, with default value ${hadoop.tmp.dir}/dfs/name.dfs.data.dir: directory where HDFS data blocks are stored, with default value ${hadoop.tmp.dir}/dfs/data.fs.checkpoint.dir: directory where secondary namenode store its checkpoints, default value is ${hadoop.tmp.dir}/dfs/namesecondary.This is why you saw the /mnt/hadoop-tmp/hadoop-${user.name} in your HDFS after formatting namenode.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With