could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation

I don't know how to fix this error:

Vertex failed, vertexName=initialmap, vertexId=vertex_1449805139484_0001_1_00, diagnostics=[Task failed, taskId=task_1449805139484_0001_1_00_000003, diagnostics=[AttemptID:attempt_1449805139484_0001_1_00_000003_0 Info:Error: org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/hadoop/gridmix-kon/input/_temporary/1/_temporary/attempt_14498051394840_0001_m_000003_0/part-m-00003/segment-121 could only be replicated to 0 nodes instead of minReplication (=1). There are 4 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget(BlockManager.java:1441)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2702)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:584)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:440)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2014)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2010)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1561)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2008)
at org.apache.hadoop.ipc.Client.call(Client.java:1411)
at org.apache.hadoop.ipc.Client.call(Client.java:1364)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103)
at com.sun.proxy.$Proxy17.addBlock(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:361)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1439)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1261)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:525)

Any idea what the cause is?

asked Dec 12 '15 22:12

Mona Jalal

1 Answer

This error is thrown in BlockManager.chooseTarget4NewBlock() (I am referring to the latest code). The specific piece of code that raises it is:

final DatanodeStorageInfo[] targets = blockplacement.chooseTarget(src,
    numOfReplicas, client, excludedNodes, blocksize, 
    favoredDatanodeDescriptors, storagePolicy);

if (targets.length < minReplication) {
  throw new IOException("File " + src + " could only be replicated to "
      + targets.length + " nodes instead of minReplication (="
      + minReplication + ").  There are "
      + getDatanodeManager().getNetworkTopology().getNumOfLeaves()
      + " datanode(s) running and "
      + (excludedNodes == null? "no": excludedNodes.size())
      + " node(s) are excluded in this operation.");
}

This happens when the BlockManager tries to choose target hosts for storing a new block of data and cannot find even one (targets.length < minReplication). minReplication is 1 here; it is controlled by the configuration parameter dfs.namenode.replication.min in hdfs-site.xml, which defaults to 1.

This could occur due to one of the following reasons:

Data Node instances are not running
Data Node instances are unable to contact the Name Node
Data Nodes have run out of space, hence no new block of data can be allocated to them
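
The first two causes can be checked from the command line. The following is only a sketch, not part of the original answer: the helper names has_datanode_process and live_datanode_count are made up for illustration, and it assumes that the JDK's jps tool is available on the Data Node hosts and that the report uses the "Live datanodes (N):" header seen in the sample output further down. Other Hadoop versions may format the report differently.

```shell
# Cause 1: is a DataNode JVM running on this host?
# Reads `jps` output ("<pid> <MainClass>" per line) on stdin and succeeds
# if a process whose main class is exactly DataNode is listed.
has_datanode_process() {
  grep -qw 'DataNode'
}
# Usage, on each Data Node host:
#   jps | has_datanode_process && echo "DataNode JVM is up"

# Cause 2: how many Data Nodes does the Name Node currently see as live?
# Reads `hdfs dfsadmin -report` output on stdin and prints N from the
# "Live datanodes (N):" header line (assumed format, see above).
live_datanode_count() {
  sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
# Usage, on a host with an HDFS client configured:
#   hdfs dfsadmin -report | live_datanode_count
```

If the live count is lower than the number of Data Nodes you started, the missing ones are probably not running or cannot reach the Name Node; their own logs are the place to look next.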

However, in your case the error message also contains the following information:

There are 4 datanode(s) running and no node(s) are excluded in this operation.

This means that 4 Data Nodes are running and all 4 of them were considered for placement of the block in this operation.

So the likely suspect is disk space on the Data Nodes. You can check it with the following command:

hdfs dfsadmin -report

It prints a report for each of your live Data Nodes. For example, in my case I got the following:

Live datanodes (1):

Name: 192.168.56.1:50010 (192.168.56.1)
Hostname: 192.168.56.1
Decommission Status : Normal
Configured Capacity: 648690003968 (604.14 GB)
DFS Used: 193849055737 (180.54 GB)
Non DFS Used: 186164975111 (173.38 GB)
DFS Remaining: 268675973120 (250.22 GB)
DFS Used%: 29.88%
DFS Remaining%: 41.42%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun Dec 13 17:17:34 IST 2015

Check "DFS Remaining" and "DFS Remaining%" in the report. They should give you an idea of how much space is left on each of your Data Nodes.
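
With several Data Nodes the report gets long, so it can help to filter it. The sketch below is an illustration under stated assumptions, not an official tool: low_space_datanodes is a hypothetical helper name, the threshold is an arbitrary choice, and the awk script relies on the "Name:" and "DFS Remaining%:" line formats shown in the sample report above.

```shell
# Reads `hdfs dfsadmin -report` output on stdin and prints
# "<Name> <DFS Remaining%>" for every Data Node whose DFS Remaining%
# is below the given threshold (a percentage, e.g. 10).
low_space_datanodes() {
  awk -v min="$1" '
    # Remember which Data Node the following lines belong to.
    /^Name:/ { name = $2 }
    # The cluster-wide summary at the top has no Name: line yet, so skip it.
    /^DFS Remaining%:/ {
      if (name != "") {
        pct = $3
        sub(/%/, "", pct)
        if (pct + 0 < min + 0) print name, $3
      }
    }
  '
}
# Usage, on a host with an HDFS client configured:
#   hdfs dfsadmin -report | low_space_datanodes 10
```

An empty result only says that no Data Node is below the chosen percentage. A node can still be rejected for a write if the space it has left is too small for the block being allocated, so treat this as a first check rather than a definitive answer.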

You can also refer to the wiki page https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo, which describes the causes of this error and ways to mitigate it.

answered Nov 03 '22 01:11

Manjunath Ballur

Tags: hadoop, hadoop2, hadoop-yarn, hdfs, apache-tez
