I'm running a single-node cluster using Hadoop version 1.0.1 on Ubuntu Linux 11.10. I was running a simple script when it crashed, probably because my computer went to sleep. I tried to reformat the file system using
bin/hadoop namenode -format
and got the following error:
ERROR namenode.NameNode: java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/name. The directory is already locked. at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:602)
I try to add the input files using the command:
bin/hadoop fs -copyFromLocal dataDirectory/*.txt inputDirectory
and get the error:
12/04/15 09:05:21 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /home/hduser/input/book1.txt could only be replicated to 0 nodes, instead of 1
12/04/15 09:05:21 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
12/04/15 09:05:21 WARN hdfs.DFSClient: Could not get block locations. Source file "/home/hduser/input/book1.txt" - Aborting...
Afterwards, I see the files in the input directory, but their sizes are 0. Any ideas on how I can add the files? I was able to add the files before Hadoop crashed, so I could reinstall Linux and Hadoop, but that seems like overkill. Thanks.
You need to stop Hadoop first using
bin/stop-all.sh
and then try to format the file system. As long as Hadoop is still running (the NameNode and DataNode processes are up), it holds a lock on the storage directory, and that is what gives you that error.
If the processes are still running after bin/stop-all.sh, kill them manually: run the command jps at the shell and it will show you the (Java) processes with a PID for each. Kill each of them with kill <pid>, for example kill 23232. Once they are all gone, delete the HDFS file system folder you have specified, then reformat using the command you quoted.
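Here is a minimal sketch of that recovery sequence, assuming the layout from your error message (hadoop.tmp.dir pointing at /app/hadoop/tmp) and that you run it from the Hadoop install directory; adjust the paths and PIDs to your own setup, and note that wiping the dfs directory destroys whatever was stored in HDFS:
bin/stop-all.sh                 # ask all daemons to shut down
jps                             # list Java processes still running (NameNode, DataNode, ...)
kill 23232                      # repeat for every leftover Hadoop PID that jps shows
rm -rf /app/hadoop/tmp/dfs      # remove the locked NameNode/DataNode storage
bin/hadoop namenode -format     # reformat; no lock should be held now
bin/start-all.sh                # bring the daemons back up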
Also check that enough disk space is available. If, say, you installed Ubuntu inside Windows, you can get more space by putting your HDFS directories in a folder under /host/.
Note: you don't need to format HDFS every time. You can just stop the NameNode and DataNode and start the Hadoop processes again, as HDFS does not get corrupted that often; format the file system only if you still get errors after stopping and starting Hadoop.
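For example, a plain restart without a format is just (a sketch using the stock Hadoop 1.x scripts):
bin/stop-all.sh      # stop NameNode, DataNode, JobTracker and TaskTracker
bin/start-all.sh     # start them again
jps                  # check that NameNode and DataNode both show up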
Hope this will help you.
1. Try to manually delete the directories which store data for your NameNode. In Hadoop 1.x the NameNode storage is set by dfs.name.dir in hdfs-site.xml and defaults to a dfs/name folder under hadoop.tmp.dir (from core-site.xml), which is the /app/hadoop/tmp path in your error; the MapReduce scratch directories such as mapred.local.dir and mapred.system.dir in mapred-site.xml can be cleared as well. After this, stop Hadoop, re-format the NameNode and try again. If you still face the issue, go to step 2.
2. Try pointing the NameNode configuration at some other paths instead of the current ones. After this, stop Hadoop, re-format the NameNode and try again. If you still face the issue, go to step 3.
3. Verify that ample disk space is available. If not, free up some space in the partition where the NameNode directories are configured. If you still face the issue, go to step 4.
4. In hdfs-site.xml, set dfs.replication to 1 (on a single-node cluster, blocks cannot be replicated to more than one node). After this, stop Hadoop, re-format the NameNode and try again; a minimal config sketch follows this list.
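As a sketch of steps 3 and 4, assuming the default conf/ directory and that your hdfs-site.xml contains nothing else you need to keep (otherwise merge the property in by hand):
df -h /app/hadoop/tmp           # step 3: confirm the partition holding HDFS has free space
cat > conf/hdfs-site.xml <<'EOF'
<?xml version="1.0"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
EOF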
If you still face the issue, please let me know along with the error/exception that you get.