
HDFS file system namespace

In the context of HDFS, we have the NameNode and the DataNodes. What does it mean to say that the NameNode stores the file system namespace?

Also, is the directory we specify for the DataNode (in hdfs-site.xml) the only place where data can be stored, or can we specify other directories to hold the data as well?

CuriousMind asked Mar 21 '14 08:03


People also ask

What is the benefit of having multiple namespaces in HDFS?

Namespace scaling: Large deployments, or deployments with a lot of small files, benefit from being able to add more NameNodes to the cluster. Performance: File system throughput is not limited by a single NameNode, so adding more NameNodes scales the file system's read/write throughput.

How files are stored in HDFS?

HDFS divides files into blocks and stores each block on a DataNode. Multiple DataNodes are linked to the master node in the cluster, the NameNode. The NameNode decides where the replicas of each block are placed across the cluster.
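As a rough illustration of the block splitting described above, here is a minimal sketch. It assumes a 128 MB block size and a replication factor of 3 (the usual defaults in Hadoop 2 and later); the function name `block_layout` is hypothetical and is not part of any Hadoop API.

```python
MB = 1024 * 1024

def block_layout(file_size, block_size=128 * MB, replication=3):
    """Return (sizes of each block in bytes, total number of stored replicas).

    Assumed defaults: 128 MB blocks, replication factor 3.
    """
    full_blocks, remainder = divmod(file_size, block_size)
    sizes = [block_size] * full_blocks
    if remainder:
        # The last block only occupies as much space as the data it holds.
        sizes.append(remainder)
    return sizes, len(sizes) * replication

sizes, replicas = block_layout(300 * MB)
print([s // MB for s in sizes], replicas)  # [128, 128, 44] 9
```

A 300 MB file therefore becomes three blocks, and with three replicas each the cluster stores nine block copies in total.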

What are the two modes of implementation of HDFS file system?

It may be implemented as a distributed filesystem, or as a "local" one that reflects the locally-connected disk. The local version exists for small Hadoop instances and for testing.


1 Answer

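The file system namespace is the hierarchy of directories and files, together with their metadata (such as permissions and the mapping from each file to its blocks). Saying that the NameNode stores the namespace means that it maintains this tree. When you put a file into HDFS, the NameNode inserts the file name into the file system tree and allocates a data block for it; the block's contents are written to the DataNodes, not to the NameNode.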
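To make the idea of a namespace more concrete, here is a toy in-memory model. It is only a sketch for illustration, not how the NameNode is actually implemented: the class `ToyNameNode` and its methods are hypothetical, and it allocates exactly one block per file for simplicity.

```python
import itertools

class ToyNameNode:
    """Toy model of a namespace: a directory tree plus a file-to-blocks map."""

    def __init__(self):
        self.tree = {}    # nested dicts: directory name -> children
        self.blocks = {}  # full file path -> list of block IDs
        self._next_block_id = itertools.count(1)

    def create_file(self, path):
        """Insert the file name into the tree and allocate one block for it."""
        parts = path.strip("/").split("/")
        node = self.tree
        for directory in parts[:-1]:
            node = node.setdefault(directory, {})
        node[parts[-1]] = None  # None marks a file rather than a directory
        block_id = next(self._next_block_id)
        self.blocks[path] = [block_id]
        return block_id

    def list_dir(self, path):
        """Return the sorted names directly under the given directory."""
        node = self.tree
        for directory in [p for p in path.strip("/").split("/") if p]:
            node = node[directory]
        return sorted(node)

nn = ToyNameNode()
nn.create_file("/user/alice/data.txt")
nn.create_file("/user/alice/logs/app.log")
print(nn.list_dir("/user/alice"))  # ['data.txt', 'logs']
```

Note that the model holds only names and block IDs. The bytes of each block would live on DataNodes, which is the division of labour the answer describes.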

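Yes, you can configure any number of data directories. Set the following property in hdfs-site.xml in the conf folder, giving the directories as a comma-separated list. (In Hadoop 2 and later the property is named dfs.datanode.data.dir; dfs.data.dir is the older, deprecated name.)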

<property>   
    <name>dfs.data.dir</name>
    <value>path to data dir 1,path to data dir 2 etc</value> 
</property>
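The value is a single comma-separated string, so it may help to see how such a list breaks down into individual directories. The sketch below parses an hdfs-site.xml fragment with Python's standard library; the two paths under /mnt are made-up examples, the helper `data_dirs` is hypothetical, and trimming whitespace around each entry is a choice of this sketch rather than a statement about Hadoop's own parser.

```python
import xml.etree.ElementTree as ET

# Example fragment; the /mnt/... paths are placeholders, not required locations.
HDFS_SITE_SNIPPET = """
<configuration>
  <property>
    <name>dfs.data.dir</name>
    <value>/mnt/disk1/hdfs/data,/mnt/disk2/hdfs/data</value>
  </property>
</configuration>
"""

def data_dirs(xml_text, key="dfs.data.dir"):
    """Return the list of directories configured under the given property."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == key:
            value = prop.findtext("value") or ""
            # Split on commas and drop empty entries.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []

print(data_dirs(HDFS_SITE_SNIPPET))
# ['/mnt/disk1/hdfs/data', '/mnt/disk2/hdfs/data']
```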
DMA answered Nov 28 '22 13:11