
Setting fs.default.name in core-site.xml Sets HDFS to Safemode

I installed the Cloudera CDH4 distribution on a single machine in pseudo-distributed mode and verified that it was working correctly (e.g. I can run MapReduce programs, insert data on the Hive server, etc.). However, if I change the core-site.xml file so that fs.default.name is set to the machine name rather than localhost and then restart the NameNode service, HDFS enters safe mode.

Before changing fs.default.name, I ran the following to check the state of HDFS:

$ hadoop dfsadmin -report
...
Configured Capacity: 18503614464 (17.23 GB)
Present Capacity: 13794557952 (12.85 GB)
DFS Remaining: 13790785536 (12.84 GB)
DFS Used: 3772416 (3.60 MB)
DFS Used%: 0.03%
Under replicated blocks: 2
Blocks with corrupt replicas: 0
Missing blocks: 0

Then I made the modification to core-site.xml (with the machine name being hadoop):

<property>
  <name>fs.default.name</name>
  <value>hdfs://hadoop:8020</value>
</property>
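
As a sanity check, the value the Hadoop client actually picks up can be read back with the getconf tool (assuming this CDH4 build supports the -confKey option; output shown is illustrative):

$ hdfs getconf -confKey fs.default.name
hdfs://hadoop:8020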

I restarted the service and reran the report.

$ sudo service hadoop-hdfs-namenode restart
$ hadoop dfsadmin -report
...
Safe mode is ON
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
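
The safe-mode state can also be queried directly with the standard dfsadmin sub-command (output shown is illustrative):

$ hadoop dfsadmin -safemode get
Safe mode is ON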

An interesting note is that I can still perform some HDFS commands. For example, I can run

$ hadoop fs -ls /tmp

However, if I try to read a file using hadoop fs -cat or try to place a file in HDFS, I am told that the NameNode is in safe mode.

$ hadoop fs -put somefile .
put: Cannot create file/user/hadinstall/somefile._COPYING_. Name node is in safe mode.

The reason I need fs.default.name set to the machine name is that I need to communicate with this machine on port 8020 (the default NameNode port) from other machines. If fs.default.name is left as localhost, the NameNode service does not listen for external connection requests.
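
One way to confirm which address the NameNode is actually listening on is to check port 8020 (a rough check, assuming netstat is available; the PID shown is a placeholder):

$ sudo netstat -tlnp | grep 8020
tcp   0   0 127.0.0.1:8020   0.0.0.0:*   LISTEN   <namenode-pid>/java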

I am at a loss as to why this is happening and would appreciate any help.

asked Oct 16 '13 by Jake Z


1 Answer

The issue stemmed from domain name resolution. The /etc/hosts file needed to be modified so that the IP address of the hadoop machine is mapped to both localhost and the fully qualified domain name:

192.168.0.201 hadoop.fully.qualified.domain.com localhost
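
After correcting /etc/hosts and restarting the NameNode, it should receive the DataNode's block reports and leave safe mode on its own. As a quick check (standard tools; output shown is illustrative), name resolution and the safe-mode state can be verified with:

$ getent hosts hadoop.fully.qualified.domain.com
192.168.0.201   hadoop.fully.qualified.domain.com   localhost
$ hadoop dfsadmin -safemode get
Safe mode is OFF
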
answered Sep 19 '22 by Jake Z