
Where does HDFS store files locally by default?

Tags: hadoop, hdfs

I am running Hadoop with the default configuration on a one-node cluster, and would like to find where HDFS stores its files on the local filesystem.

Any ideas?

Thanks.

crypto5 asked Mar 01 '10 19:03



3 Answers

You need to look in your hdfs-default.xml configuration file for the dfs.data.dir setting. The default value is ${hadoop.tmp.dir}/dfs/data; note that ${hadoop.tmp.dir} is itself defined in core-default.xml.

The description for this setting in the configuration documentation is:

Determines where on the local filesystem an DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.
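The behaviour described above (a comma-delimited list, with non-existent directories ignored) can be sketched in Python. This is only an illustration of the documented description, not Hadoop's own code; the function name existing_data_dirs is made up for this example.

```python
import os


def existing_data_dirs(value):
    """Split a comma-delimited dfs.data.dir value and keep only existing directories.

    Hypothetical helper: it mirrors the documented description of the setting,
    not the actual DataNode implementation.
    """
    # Split on commas and drop surrounding whitespace and empty entries.
    candidates = [part.strip() for part in value.split(",") if part.strip()]
    # Directories that do not exist are ignored, as the description says.
    return [path for path in candidates if os.path.isdir(path)]
```

Calling it with the value from your own configuration shows which of the listed directories are actually present on the local disk.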

Binary Nerd answered Oct 16 '22 15:10


It seems that for version 2.7.1 the directory is

/tmp/hadoop-${user.name}/dfs/data 

This is based on the dfs.datanode.data.dir and hadoop.tmp.dir settings from:

http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml

http://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/core-default.xml
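To see how the nested ${...} placeholders lead to that path, here is a minimal Python sketch of the substitution. The DEFAULTS table holds the values as quoted in this thread (an assumption for illustration, nothing is read from a real cluster), and resolve is a hypothetical helper, not part of Hadoop.

```python
import re

# Default values as quoted in this thread; assumed for illustration only.
DEFAULTS = {
    "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
    "dfs.datanode.data.dir": "${hadoop.tmp.dir}/dfs/data",
}

PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def resolve(key, props, system_props, max_depth=20):
    """Expand ${name} placeholders in props[key] until none remain.

    system_props stands in for JVM system properties such as user.name
    and is consulted before the configuration properties.
    """
    value = props[key]
    for _ in range(max_depth):
        match = PLACEHOLDER.search(value)
        if match is None:
            return value
        name = match.group(1)
        if name in system_props:
            replacement = system_props[name]
        elif name in props:
            replacement = props[name]
        else:
            raise KeyError("no value for placeholder: " + name)
        # Replace only the placeholder that was just matched.
        value = value[:match.start()] + replacement + value[match.end():]
    raise ValueError("too many nested substitutions for " + key)
```

For example, resolve("dfs.datanode.data.dir", DEFAULTS, {"user.name": "alice"}) expands first hadoop.tmp.dir and then user.name, giving /tmp/hadoop-alice/dfs/data.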

MaxNevermind answered Oct 16 '22 17:10


As a more recent answer, and to clarify the Hadoop version numbers:

If you use Hadoop 1.2.1 (or something similar), @Binary Nerd's answer is still true.

But if you use Hadoop 2.1.0-beta (or something similar), you should read the configuration documentation for that version; the option you want to set is dfs.datanode.data.dir.
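The version split can be written down as a small Python helper. It is a sketch based only on the versions named in this thread: data_dir_property is a hypothetical name, and treating everything before 2.x like 1.2.1 is an assumption.

```python
def data_dir_property(hadoop_version):
    """Return the DataNode storage property name for a Hadoop version string.

    Assumption: versions before 2.x behave like 1.2.1 (dfs.data.dir) and
    2.x and later behave like 2.1.0-beta (dfs.datanode.data.dir).
    """
    # The major version is the number before the first dot, e.g. "2" in "2.1.0-beta".
    major = int(hadoop_version.split(".", 1)[0])
    return "dfs.data.dir" if major < 2 else "dfs.datanode.data.dir"
```

So "1.2.1" maps to dfs.data.dir, while "2.1.0-beta" and "2.7.1" map to dfs.datanode.data.dir.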

contradictioned answered Oct 16 '22 16:10