
Hadoop safemode recovery - taking too long!

I have a Hadoop cluster with 18 data nodes. I restarted the name node over two hours ago and it is still in safe mode.

I have been searching for why this is taking so long and cannot find a good answer. The post "Hadoop safemode recovery - taking lot of time" is relevant, but I'm not sure whether I want or need to restart the name node after changing this setting, as that post suggests:

<property>
  <name>dfs.namenode.handler.count</name>
  <value>3</value>
  <final>true</final>
</property>

In any case, this is what I've been getting in 'hadoop-hadoop-namenode-hadoop-name-node.log':

2011-02-11 01:39:55,226 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020, call delete(/tmp/hadoop-hadoop/mapred/system, true) from 10.1.206.27:54864: error: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode. The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode. The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1711)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1691)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:565)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)
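The numbers in that message can be checked by hand. The sketch below is not Hadoop code; it parses the message text from the log and recomputes how many blocks are still missing. The helper names (`parse_safemode_message`, `blocks_still_needed`) are hypothetical, and the assumption that the required count is the threshold fraction of total blocks rounded down is mine, chosen because it reproduces the figure in the log.

```python
import re

# Message text copied from the namenode log above.
LOG_MESSAGE = (
    "The reported blocks 319128 needs additional 7183 blocks to reach "
    "the threshold 0.9990 of total blocks 326638."
)


def parse_safemode_message(message):
    """Extract (reported, additional, threshold, total) from a safe mode message."""
    pattern = (
        r"reported blocks (\d+) needs additional (\d+) blocks to reach "
        r"the threshold ([\d.]+) of total blocks (\d+)"
    )
    match = re.search(pattern, message)
    if match is None:
        raise ValueError("not a safe mode threshold message")
    reported, additional, threshold, total = match.groups()
    return int(reported), int(additional), float(threshold), int(total)


def blocks_still_needed(reported, threshold, total):
    """Recompute the missing block count.

    Assumption: the required count is total * threshold, rounded down.
    """
    required = int(total * threshold)
    return max(required - reported, 0)


reported, additional, threshold, total = parse_safemode_message(LOG_MESSAGE)
needed = blocks_still_needed(reported, threshold, total)
print(f"reported {reported / total:.2%} of blocks; still need {needed}")
```

Under that assumption the recomputed count matches the 7183 in the log, and the data nodes have reported roughly 97.7% of blocks against a 99.9% threshold. If that count stops moving between log entries, the remaining blocks are probably not going to be reported by waiting longer.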

Any advice is appreciated. Thanks!

asked Feb 11 '11 07:02 by senile_genius


People also ask

How do I get Hadoop out of safe mode?

Use "hdfs dfsadmin -safemode leave" to turn safe mode off.


1 Answer

I ran into this once: some blocks were never reported by the data nodes. I had to force the namenode out of safe mode (hadoop dfsadmin -safemode leave) and then run fsck to delete the files with missing blocks.
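As a rough sketch of that sequence, the Python below builds the commands in order and, by default, only prints them. The function names `recovery_commands` and `run_recovery` are hypothetical. The `-safemode get` check and the read-only fsck pass are additions of mine for safety, and using `fsck -delete` is my reading of "run an fsck to delete missing files"; the answer does not spell out the exact flags.

```python
import subprocess


def recovery_commands(path="/"):
    """Return the command sequence, in order, as argument lists."""
    return [
        ["hadoop", "dfsadmin", "-safemode", "get"],    # added: check the current state
        ["hadoop", "dfsadmin", "-safemode", "leave"],  # from the answer: force safe mode off
        ["hadoop", "fsck", path],                      # added: inspect before deleting
        ["hadoop", "fsck", path, "-delete"],           # assumed flag: delete corrupted files
    ]


def run_recovery(path="/", dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for command in recovery_commands(path):
        line = " ".join(command)
        print(line)
        printed.append(line)
        if not dry_run:
            # Exit codes are reported rather than treated as fatal.
            result = subprocess.run(command)
            print(f"exit code: {result.returncode}")
    return printed


if __name__ == "__main__":
    run_recovery()  # dry run: nothing is executed
```

The dry-run default is deliberate: `fsck -delete` removes files permanently, so it is worth reading the output of the plain fsck pass before calling `run_recovery(dry_run=False)`.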

answered Sep 27 '22 15:09 by xinit