 

Writing to HDFS from Java, getting "could only be replicated to 0 nodes instead of minReplication"


I’ve downloaded and started Cloudera's Hadoop Demo VM for CDH4 (running Hadoop 2.0.0). I’m trying to write a Java program that runs from my Windows 7 machine (the same machine/OS that the VM is running in). I have a sample program like:

public static void main(String[] args) {
    try {
        Configuration conf = new Configuration();
        conf.addResource("config.xml");
        FileSystem fs = FileSystem.get(conf);
        FSDataOutputStream fdos = fs.create(new Path("/testing/file01.txt"), true);
        fdos.writeBytes("Test text for the txt file");
        fdos.flush();
        fdos.close();
        fs.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}

My config.xml file only has one property defined: fs.default.name=hdfs://CDH4_IP:8020.
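For completeness, the same property can also be set directly on the Configuration object instead of through config.xml; this is only a minimal sketch, with CDH4_IP standing in for the VM's address as above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsConfSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Same setting the question's config.xml provides; CDH4_IP is a placeholder.
        // In Hadoop 2.x, fs.defaultFS is the newer name for this property.
        conf.set("fs.default.name", "hdfs://CDH4_IP:8020");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to: " + fs.getUri());
        fs.close();
    }
}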

When I run it I’m getting the following exception:

org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)
    at org.apache.hadoop.ipc.Client.call(Client.java:1160)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at $Proxy9.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
    at $Proxy9.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:290)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1150)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1003)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:463)

I’ve looked around the internet, and it seems this happens when disk space is low, but that’s not the case for me. When I run "hdfs dfsadmin -report" I get the following:

Configured Capacity: 25197727744 (23.47 GB)
Present Capacity: 21771988992 (20.28 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used: 1273856 (1.21 MB)
DFS Used%: 0.01%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 1 (1 total, 0 dead)

Live datanodes:
Name: 127.0.0.1:50010 (localhost.localdomain)
Hostname: localhost.localdomain
Decommission Status : Normal
Configured Capacity: 25197727744 (23.47 GB)
DFS Used: 1273856 (1.21 MB)
Non DFS Used: 3425738752 (3.19 GB)
DFS Remaining: 21770715136 (20.28 GB)
DFS Used%: 0.01%
DFS Remaining%: 86.4%
Last contact: Fri Jan 11 17:30:56 EST 2013

I can also run this code just fine from within the VM. I’m not sure what the problem is or how to fix it. This is my first time using Hadoop, so I’m probably missing something basic. Any ideas?

Update

The only thing I see in the logs is an exception similar to the one I get on the client:

java.io.IOException: File /testing/file01.txt could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1322)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2170)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:297)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:44080)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:453)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1687)

I tried changing the permissions on the data directory (/var/lib/hadoop-hdfs/cache/hdfs/dfs/data) and that didn't fix it (I went so far as giving full access to everyone).

I noticed that when browsing HDFS via the HUE web app, the folder structure was created and the file does exist, but it is empty. I tried putting the file under the default user directory by using

FSDataOutputStream fdos=fs.create(new Path("testing/file04.txt"), true);  

instead of

FSDataOutputStream fdos=fs.create(new Path("/testing/file04.txt"), true); 

which makes the file path become "/user/dharris/testing/file04.txt" ('dharris' is my Windows user). But that gave me the same kind of error.
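As an aside, here is a minimal sketch (not part of the original program) that prints where relative paths resolve to, which is why "testing/file04.txt" lands under /user/<username>; it assumes the same config.xml as above:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class HdfsPathSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.addResource("config.xml");
        FileSystem fs = FileSystem.get(conf);
        // Relative Paths are resolved against the working directory, which defaults
        // to the client user's home directory (/user/<client username>).
        System.out.println("Home directory:    " + fs.getHomeDirectory());
        System.out.println("Working directory: " + fs.getWorkingDirectory());
        fs.close();
    }
}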

Asked Jan 11 '13 by David Harris


People also ask

Could only be replicated to 0 nodes instead of 1?

When a file is written to HDFS, its blocks are replicated to multiple DataNodes. When you see this error, it means that the NameNode daemon does not have any available DataNode instances to write the data to. In other words, block replication is not taking place.

How does HDFS replicate data?

Suppose the HDFS file has a replication factor of three. When the local file accumulates a full block of user data, the client retrieves a list of DataNodes from the NameNode. This list contains the DataNodes that will host a replica of that block. The client then flushes the data block to the first DataNode.

What is the default replication factor in HDFS?

The default replication factor in HDFS is 3. This means that every block will have two more copies of it, each stored on separate DataNodes in the cluster.

Why is data replication important in HDFS?

Replication in HDFS increases the availability of data at any point in time. If a node containing a block of data used for processing crashes, the same block can be read from another node; this is possible because of replication.
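As a hedged illustration of the replication-factor point above (not taken from the original question or answers), a client can request a specific replication factor when creating a file, which is one way to match a single-DataNode demo VM; the path, buffer size, and block size below are placeholder values:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ReplicationSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/testing/file01.txt"); // placeholder path
        // Ask for a single replica, e.g. on a one-DataNode demo VM;
        // buffer size and block size here are illustrative values.
        short replication = 1;
        FSDataOutputStream out = fs.create(path, true, 4096, replication, 128L * 1024 * 1024);
        out.writeBytes("Test text for the txt file");
        out.close();
        // The replication factor of an existing file can also be changed later:
        fs.setReplication(path, replication);
        fs.close();
    }
}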


2 Answers

I had the same problem.
In my case, the key to the problem was the following error message:
There are 1 datanode(s) running and 1 node(s) are excluded in this operation.

It means that your hdfs-client couldn't connect to your datanode on port 50010. Because you could connect to the HDFS namenode, you were able to get the datanode's status, but your hdfs-client failed to connect to the datanode itself.

(In HDFS, the namenode manages the file directories and the datanodes. When an hdfs-client connects to the namenode, it finds the target file path and the address of the datanode that holds the data, and then the hdfs-client communicates with that datanode directly. You can check the datanode's URI with netstat, because the hdfs-client will try to communicate with the datanode using the address reported by the namenode.)
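As a quick way to test that claim, here is a minimal sketch that checks whether the DataNode port is reachable from the client machine; the host below is a placeholder for whatever address the NameNode reports (in the question it is 127.0.0.1/localhost.localdomain, per "hdfs dfsadmin -report"):

import java.net.InetSocketAddress;
import java.net.Socket;

public class DataNodePortCheck {
    public static void main(String[] args) {
        // Placeholders: the DataNode address the NameNode hands back to the client.
        String dataNodeHost = "CDH4_IP";
        int dataNodePort = 50010; // dfs.datanode.address
        Socket socket = new Socket();
        try {
            // Try a plain TCP connect with a 5-second timeout.
            socket.connect(new InetSocketAddress(dataNodeHost, dataNodePort), 5000);
            System.out.println("DataNode port is reachable from this client.");
            socket.close();
        } catch (Exception e) {
            System.out.println("Cannot reach DataNode port: " + e);
        }
    }
}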

I solved the problem by:

  1. opening port 50010 (dfs.datanode.address) in the firewall,
  2. adding the property "dfs.client.use.datanode.hostname" = "true" to the client configuration (see the sketch just below this list), and
  3. adding the VM's hostname to the hosts file on my client PC.
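A minimal sketch of step 2, assuming the same placeholder NameNode address as in the question:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class ClientHostnameSketch {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("fs.default.name", "hdfs://CDH4_IP:8020"); // placeholder NameNode address
        // Make the client connect to DataNodes by hostname instead of the
        // (VM-internal) IP address the NameNode reports back to it.
        conf.set("dfs.client.use.datanode.hostname", "true");
        FileSystem fs = FileSystem.get(conf);
        System.out.println("Connected to: " + fs.getUri());
        fs.close();
    }
}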

I'm sorry for my poor English skill.

Answered Oct 10 '22 by kook


Go to the Linux VM and check its hostname and IP address (use the ifconfig command). Then, on the Linux VM, edit the /etc/hosts file with

IPADDRESS (SPACE) HOSTNAME

example: 192.168.110.27 clouderavm

Then update all of your Hadoop configuration files:

core-site.xml

hdfs-site.xml

mapred-site.xml

yarn-site.xml

changing localhost, localhost.localdomain, or 0.0.0.0 to your hostname.

Then restart Cloudera Manager.

On the Windows machine, edit C:\Windows\System32\Drivers\etc\hosts

and add one line at the end with your VM's IP address and hostname (the same entry you made in /etc/hosts on the VM):

VMIPADDRESS VMHOSTNAME

example: 192.168.110.27 clouderavm
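To verify the hosts entry before re-running the job, a minimal sketch that simply resolves the hostname from the Windows client ("clouderavm" is just the example hostname above):

import java.net.InetAddress;

public class HostsEntryCheck {
    public static void main(String[] args) throws Exception {
        // Resolve the VM hostname added to the Windows hosts file.
        InetAddress address = InetAddress.getByName("clouderavm");
        System.out.println("clouderavm resolves to " + address.getHostAddress());
    }
}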

Then check again; it should work. For a detailed configuration walkthrough, see the following video on YouTube:

https://www.youtube.com/watch?v=fSGpYHjGIRY

Answered Oct 10 '22 by Chennakrishna