Hadoop cluster setup - java.net.ConnectException: Connection refused

I want to set up a Hadoop cluster in pseudo-distributed mode. I managed to perform all the setup steps, including starting a NameNode, DataNode, JobTracker and TaskTracker on my machine.

Then I tried to run some example programs and got the java.net.ConnectException: Connection refused error. I went back to the very first steps, running some operations in standalone mode, and hit the same problem.

I have even triple-checked all the installation steps and have no idea how to fix it. (I am new to Hadoop and a beginner Ubuntu user, so please take that into account when providing any guide or tip.)

This is the error output I keep receiving:

hduser@marta-komputer:/usr/local/hadoop$ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0.jar grep input output 'dfs[a-z.]+'
15/02/22 18:23:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/02/22 18:23:04 INFO client.RMProxy: Connecting to ResourceManager at /
java.net.ConnectException: Call From marta-komputer/ to localhost:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
    at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:791)
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:731)
    at org.apache.hadoop.ipc.Client.call(Client.java:1472)
    at org.apache.hadoop.ipc.Client.call(Client.java:1399)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:232)
    at com.sun.proxy.$Proxy9.delete(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.delete(ClientNamenodeProtocolTranslatorPB.java:521)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
    at com.sun.proxy.$Proxy10.delete(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.delete(DFSClient.java:1929)
    at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:638)
    at org.apache.hadoop.hdfs.DistributedFileSystem$12.doCall(DistributedFileSystem.java:634)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.delete(DistributedFileSystem.java:634)
    at org.apache.hadoop.examples.Grep.run(Grep.java:95)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.examples.Grep.main(Grep.java:101)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:483)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:716)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:530)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:494)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:607)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:705)
    at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:368)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1521)
    at org.apache.hadoop.ipc.Client.call(Client.java:1438)
    ... 32 more

etc/hadoop/hadoop-env.sh file:

# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

# The jsvc implementation to use. Jsvc is required to run secure datanodes
# that bind to privileged ports to provide authentication of data transfer
# protocol.  Jsvc is not required if SASL is configured for authentication of
# data transfer protocol using non-privileged ports.
#export JSVC_HOME=${JSVC_HOME}

export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/etc/hadoop"}

# Extra Java CLASSPATH elements.  Automatically insert capacity-scheduler.
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done

# The maximum amount of heap to use, in MB. Default is 1000.
#export HADOOP_HEAPSIZE=
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""

# Extra Java runtime options.  Empty by default.
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"

# Command specific options appended to HADOOP_OPTS when specified
export HADOOP_NAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"

export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"

export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx512m $HADOOP_PORTMAP_OPTS"

# The following applies to multiple commands (fs, dfs, fsck, distcp etc)
export HADOOP_CLIENT_OPTS="-Xmx512m $HADOOP_CLIENT_OPTS"
#HADOOP_JAVA_PLATFORM_OPTS="-XX:-UsePerfData $HADOOP_JAVA_PLATFORM_OPTS"

# On secure datanodes, user to run the datanode as after dropping privileges.
# This **MUST** be uncommented to enable secure HDFS if using privileged ports
# to provide authentication of data transfer protocol.  This **MUST NOT** be
# defined if SASL is configured for authentication of data transfer protocol
# using non-privileged ports.
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}

# Where log files are stored.  $HADOOP_HOME/logs by default.
#export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER

# Where log files are stored in the secure data environment.
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}

# HDFS Mover specific parameters
###
# Specify the JVM options to be used when starting the HDFS Mover.
# These options will be appended to the options specified as HADOOP_OPTS
# and therefore may override any similar flags set in HADOOP_OPTS
#
# export HADOOP_MOVER_OPTS=""

###
# Advanced Users Only!
###

# The directory where pid files are stored. /tmp by default.
# NOTE: this should be set to a directory that can only be written to by
#       the user that will run the hadoop daemons.  Otherwise there is the
#       potential for a symlink attack.
export HADOOP_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}

# A string representing this instance of hadoop. $USER by default.
export HADOOP_IDENT_STRING=$USER

.bashrc file Hadoop-related fragment:


/usr/local/hadoop/etc/hadoop/core-site.xml file:

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop_tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

/usr/local/hadoop/etc/hadoop/hdfs-site.xml file:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>

  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop_tmp/hdfs/namenode</value>
  </property>

  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop_tmp/hdfs/datanode</value>
  </property>
</configuration>

/usr/local/hadoop/etc/hadoop/yarn-site.xml file:

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>

/usr/local/hadoop/etc/hadoop/mapred-site.xml file:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Running bin/hdfs namenode -format produces the following output (I have replaced some parts of it with (...)):

hduser@marta-komputer:/usr/local/hadoop$ bin/hdfs namenode -format
15/02/22 18:50:47 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = marta-komputer/
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
STARTUP_MSG:   classpath = /usr/local/hadoop/etc/hadoop:/usr/local/hadoop/share/hadoop/common/lib/htrace-core-3.0.4.jar:/usr/local/hadoop/share/hadoop/common/lib/commons-cli (...)2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-common-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-hs-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-app-2.6.0.jar:/usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-shuffle-2.6.0.jar:/usr/local/hadoop/contrib/capacity-scheduler/*.jar
STARTUP_MSG:   build = https://git-wip-us.apache.org/repos/asf/hadoop.git -r e3496499ecb8d220fba99dc5ed4c99c8f9e33bb1; compiled by 'jenkins' on 2014-11-13T21:10Z
STARTUP_MSG:   java = 1.8.0_31
************************************************************/
15/02/22 18:50:47 INFO namenode.NameNode: registered UNIX signal handlers for [TERM, HUP, INT]
15/02/22 18:50:47 INFO namenode.NameNode: createNameNode [-format]
15/02/22 18:50:47 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Formatting using clusterid: CID-0b65621a-eab3-47a4-bfd0-62b5596a940c
15/02/22 18:50:48 INFO namenode.FSNamesystem: No KeyProvider found.
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsLock is fair:true
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.block.invalidate.limit=1000
15/02/22 18:50:48 INFO blockmanagement.DatanodeManager: dfs.namenode.datanode.registration.ip-hostname-check=true
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.namenode.startup.delay.block.deletion.sec is set to 000:00:00:00.000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: The block deletion will start around 2015 Feb 22 18:50:48
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map BlocksMap
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 2.0% max memory 889 MB = 17.8 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^21 = 2097152 entries
15/02/22 18:50:48 INFO blockmanagement.BlockManager: dfs.block.access.token.enable=false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: defaultReplication         = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplication             = 512
15/02/22 18:50:48 INFO blockmanagement.BlockManager: minReplication             = 1
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxReplicationStreams      = 2
15/02/22 18:50:48 INFO blockmanagement.BlockManager: shouldCheckForEnoughRacks  = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: replicationRecheckInterval = 3000
15/02/22 18:50:48 INFO blockmanagement.BlockManager: encryptDataTransfer        = false
15/02/22 18:50:48 INFO blockmanagement.BlockManager: maxNumBlocksToLog          = 1000
15/02/22 18:50:48 INFO namenode.FSNamesystem: fsOwner             = hduser (auth:SIMPLE)
15/02/22 18:50:48 INFO namenode.FSNamesystem: supergroup          = supergroup
15/02/22 18:50:48 INFO namenode.FSNamesystem: isPermissionEnabled = true
15/02/22 18:50:48 INFO namenode.FSNamesystem: HA Enabled: false
15/02/22 18:50:48 INFO namenode.FSNamesystem: Append Enabled: true
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map INodeMap
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 1.0% max memory 889 MB = 8.9 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^20 = 1048576 entries
15/02/22 18:50:48 INFO namenode.NameNode: Caching file names occuring more than 10 times
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map cachedBlocks
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.25% max memory 889 MB = 2.2 MB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^18 = 262144 entries
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.threshold-pct = 0.9990000128746033
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.min.datanodes = 0
15/02/22 18:50:48 INFO namenode.FSNamesystem: dfs.namenode.safemode.extension     = 30000
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache on namenode is enabled
15/02/22 18:50:48 INFO namenode.FSNamesystem: Retry cache will use 0.03 of total heap and retry cache entry expiry time is 600000 millis
15/02/22 18:50:48 INFO util.GSet: Computing capacity for map NameNodeRetryCache
15/02/22 18:50:48 INFO util.GSet: VM type       = 64-bit
15/02/22 18:50:48 INFO util.GSet: 0.029999999329447746% max memory 889 MB = 273.1 KB
15/02/22 18:50:48 INFO util.GSet: capacity      = 2^15 = 32768 entries
15/02/22 18:50:48 INFO namenode.NNConf: ACLs enabled? false
15/02/22 18:50:48 INFO namenode.NNConf: XAttrs enabled? true
15/02/22 18:50:48 INFO namenode.NNConf: Maximum size of an xattr: 16384
Re-format filesystem in Storage Directory /usr/local/hadoop_tmp/hdfs/namenode ? (Y or N) Y
15/02/22 18:50:50 INFO namenode.FSImage: Allocated new BlockPoolId: BP-948369552-
15/02/22 18:50:50 INFO common.Storage: Storage directory /usr/local/hadoop_tmp/hdfs/namenode has been successfully formatted.
15/02/22 18:50:50 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
15/02/22 18:50:50 INFO util.ExitUtil: Exiting with status 0
15/02/22 18:50:50 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at marta-komputer/
************************************************************/

Starting dfs and yarn results in the following output:

hduser@marta-komputer:/usr/local/hadoop$ start-dfs.sh
15/02/22 18:53:05 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting namenodes on [localhost]
localhost: starting namenode, logging to /usr/local/hadoop/logs/hadoop-hduser-namenode-marta-komputer.out
localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-hduser-datanode-marta-komputer.out
Starting secondary namenodes []
starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-hduser-secondarynamenode-marta-komputer.out
15/02/22 18:53:20 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
hduser@marta-komputer:/usr/local/hadoop$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/logs/yarn-hduser-resourcemanager-marta-komputer.out
localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-hduser-nodemanager-marta-komputer.out

Calling jps shortly after that gives:

hduser@marta-komputer:/usr/local/hadoop$ jps
11696 ResourceManager
11842 NodeManager
11171 NameNode
11523 SecondaryNameNode
12167 Jps
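Note that the jps listing above contains no DataNode entry, even though start-dfs.sh reported starting one. A minimal sketch of how such a gap could be spotted automatically, assuming the five daemon names below are the ones expected in pseudo-distributed mode on Hadoop 2.x (the function name missing_daemons is hypothetical, not part of Hadoop):

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `jps` output on stdin and prints every
# expected daemon that does not appear in it.
missing_daemons() {
  local jps_output daemon
  jps_output=$(cat)
  # Assumed daemon set for a pseudo-distributed Hadoop 2.x installation.
  for daemon in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -w matches whole words only, so SecondaryNameNode is not counted as NameNode.
    if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
      printf '%s\n' "$daemon"
    fi
  done
}
```

It would be used as `jps | missing_daemons`; any name it prints points at a daemon whose log under /usr/local/hadoop/logs is worth reading first.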

netstat output:

hduser@marta-komputer:/usr/local/hadoop$ sudo netstat -lpten | grep java
tcp        0      0  *               LISTEN      1001       690283      11696/java
tcp        0      0  *               LISTEN      1001       684574      11842/java
tcp        0      0  *               LISTEN      1001       680955      11842/java
tcp        0      0  *               LISTEN      1001       684531      11696/java
tcp        0      0  *               LISTEN      1001       684524      11696/java
tcp        0      0  *               LISTEN      1001       680879      11696/java
tcp        0      0  *               LISTEN      1001       687392      11696/java
tcp        0      0  *               LISTEN      1001       680951      11842/java
tcp        0      0  *               LISTEN      1001       687242      11171/java
tcp        0      0  *               LISTEN      1001       680956      11842/java
tcp        0      0  *               LISTEN      1001       690252      11523/java
tcp        0      0  *               LISTEN      1001       687239      11171/java

/etc/hosts file:

      localhost       marta-komputer

# The following lines are desirable for IPv6 capable hosts
::1     ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters



I updated the core-site.xml and now I have:

<property>
  <name>fs.default.name</name>
  <value>hdfs://marta-komputer:9000</value>
</property>

but I keep receiving the error, which now starts with:

15/03/01 00:59:34 INFO client.RMProxy: Connecting to ResourceManager at /
java.net.ConnectException: Call From marta-komputer.home/ to marta-komputer:9000 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

I also notice that telnet localhost 9000 is not working:

hduser@marta-komputer:~$ telnet localhost 9000
Trying
telnet: Unable to connect to remote host: Connection refused
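The same reachability test can be done without telnet. The sketch below assumes bash, because it relies on bash's /dev/tcp redirection, which plain sh does not provide; the function name port_open is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a TCP connection to host $1, port $2
# can be opened, and fails otherwise (for example on "Connection refused").
port_open() {
  local host=$1 port=$2
  # The subshell opens the socket on descriptor 3 and closes it again on exit.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}
```

For example, `port_open localhost 9000 && echo open || echo refused` reports whether anything is accepting connections on the port that fs.default.name points at.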
2 Answers

For me, these steps worked:

  1. stop-all.sh
  2. hadoop namenode -format
  3. start-all.sh
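The three steps above can be sketched as a small script. Keep in mind that formatting the NameNode erases the existing HDFS metadata, so this only suits a fresh or disposable installation. The names run, restart_with_format and DRY_RUN are hypothetical and serve only to make the sketch safe to try:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: with DRY_RUN=1 the command is only printed,
# otherwise it is executed as given.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Runs the three steps from the answer in order and stops at the first failure.
# WARNING: `hadoop namenode -format` wipes the NameNode metadata.
restart_with_format() {
  run stop-all.sh &&
  run hadoop namenode -format &&
  run start-all.sh
}
```

Setting DRY_RUN=1 before calling restart_with_format prints the commands without touching the cluster; calling it without DRY_RUN assumes the Hadoop bin and sbin directories are on the PATH.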

Hi. Edit your conf/core-site.xml and change localhost to the address used in the conf below. That should work.

<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://</value>
  </property>
</configuration>
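After editing, it can help to confirm which address the file actually contains. A minimal sketch, assuming each `<value>` element directly follows its `<name>` element as in the files shown in the question (the function name conf_value is hypothetical, and the match is a plain text pattern, not a real XML parser):

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints the <value> that directly follows
# <name>$2</name> in the Hadoop XML config file $1.
conf_value() {
  # Join the file into one line so name and value may sit on separate lines.
  tr -d '\n' < "$1" |
    sed -n "s|.*<name>$2</name>[[:space:]]*<value>\([^<]*\)</value>.*|\1|p"
}
```

For example, `conf_value /usr/local/hadoop/etc/hadoop/core-site.xml fs.default.name` prints the configured filesystem URI, which should name the same host and port that the client reports in the Connection refused message.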
