
Hadoop namenode needs to be formatted after every computer start

Tags: hadoop

I have searched for this problem, and while there are a number of similar questions, I can't find a common solution or one that works for me. I have installed Hadoop and am running it in pseudo-distributed mode. It works fine, and I can start and stop it any number of times without trouble. However, if I restart the computer and then start Hadoop, the namenode doesn't start. I need to format it every time, which means I lose all the work I have done and have to start again.

I am following Hadoop: The Definitive Guide v3.

My core-site.xml says:

<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost/</value>
    </property>
</configuration>

My hdfs-site.xml says:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

Is there a way of configuring Hadoop so that I don't need to re-format the namenode every time I restart the computer?

Thanks.

Nick asked Nov 22 '14 22:11


1 Answer

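It looks like you are not overriding the HDFS settings dfs.name.dir and dfs.data.dir. By default both resolve to directories under hadoop.tmp.dir, which itself defaults to /tmp/hadoop-${user.name}, and /tmp is typically cleared when your machine restarts. That is why the namenode has nothing to start from after a reboot. Move them from /tmp to a location in your home directory by overriding these values in the hdfs-site.xml file in your Hadoop configuration directory. (On Hadoop 2.x and later the same settings are named dfs.namenode.name.dir and dfs.datanode.data.dir; the old names are still accepted but deprecated.)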
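If you want to confirm that this is what is happening on your machine, a small check along these lines may help. It is only a sketch in Python and not part of Hadoop: the helper name volatile_storage_settings is made up for illustration, and it assumes the old-style property names dfs.name.dir and dfs.data.dir used in this answer. It lists each storage property that is either unset (so the /tmp-based default applies) or explicitly set to a path under /tmp.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Property names as used in this answer (assumption: old-style names).
STORAGE_PROPERTIES = ("dfs.name.dir", "dfs.data.dir")


def read_properties(xml_text):
    """Return {name: value} for every <property> in a Hadoop config document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def local_path(value):
    """Strip an optional file:// scheme so only the filesystem path remains."""
    parsed = urlparse(value)
    return parsed.path if parsed.scheme == "file" else value


def volatile_storage_settings(xml_text):
    """Hypothetical helper: report storage properties that are unset or under /tmp."""
    props = read_properties(xml_text)
    problems = {}
    for name in STORAGE_PROPERTIES:
        if name not in props:
            problems[name] = "not set, so the /tmp-based default applies"
            continue
        path = local_path(props[name])
        if path == "/tmp" or path.startswith("/tmp/"):
            problems[name] = "set to a path under /tmp"
    return problems


# The hdfs-site.xml from the question: only dfs.replication is configured.
QUESTION_CONFIG = """<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>"""

for name, reason in volatile_storage_settings(QUESTION_CONFIG).items():
    print(f"{name}: {reason}")
```

Run against the hdfs-site.xml from the question, both properties are reported as unset, which matches the symptom described.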

Follow these steps:

Create a directory in your home directory to hold the namenode image and datanode blocks (replace <USER> with your login name):

mkdir /home/<USER>/pseudo/

Modify the hdfs-site.xml file in your HADOOP_CONF_DIR (the Hadoop configuration directory) as follows:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
    <property>
        <name>dfs.name.dir</name>
        <value>file:///home/<USER>/pseudo/dfs/name</value>
    </property>
    <property>
        <name>dfs.data.dir</name>
        <value>file:///home/<USER>/pseudo/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
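Because <USER> is only a placeholder, the file has to end up containing your real home path. As a rough illustration, the Python sketch below fills it in for the current user. render_hdfs_site is a hypothetical helper, not a Hadoop tool, and it assumes the same directory layout (pseudo/dfs/name and pseudo/dfs/data) and the same property names as the file above.

```python
from pathlib import Path
from xml.sax.saxutils import escape


def render_hdfs_site(home_dir):
    """Hypothetical helper: build hdfs-site.xml text for the layout used in this answer."""
    base = Path(home_dir) / "pseudo" / "dfs"
    properties = [
        # as_uri() turns an absolute path into a file:/// URI.
        ("dfs.name.dir", (base / "name").as_uri()),
        ("dfs.data.dir", (base / "data").as_uri()),
        ("dfs.replication", "1"),
    ]
    lines = [
        '<?xml version="1.0"?>',
        '<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>',
        "<configuration>",
    ]
    for name, value in properties:
        lines += [
            "    <property>",
            f"        <name>{escape(name)}</name>",
            f"        <value>{escape(value)}</value>",
            "    </property>",
        ]
    lines.append("</configuration>")
    return "\n".join(lines) + "\n"


# Print the file contents for the current user's home directory.
print(render_hdfs_site(Path.home()))
```

The script only prints the text; review the output and save it as hdfs-site.xml yourself, so an existing configuration is never overwritten by accident.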

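Format your HDFS namenode one last time (hadoop namenode -format, or hdfs namenode -format on Hadoop 2.x), then start Hadoop and use it as before. Because the data now lives outside /tmp, it survives a reboot and no further formatting is needed.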

SachinJ answered Nov 13 '22 09:11