
Cloudera Manager installation failed to receive heartbeat from agent when adding new hosts to cluster

I am trying to install Cloudera Manager (standard version) on Ubuntu 12.04.1 LTS. When I try to add a new host, I get the following error:

Installation failed. Failed to receive heartbeat from agent.
Ensure that the host's hostname is configured properly.
Ensure that port 7182 is accessible on the Cloudera Manager server (check firewall rules).
Ensure that ports 9000 and 9001 are free on the host being added.
Check agent logs in /var/log/cloudera-scm-agent/ on the host being added (some of the logs can be found in the installation details).
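
The port checks named in that message can be tried by hand. As a rough sketch (not part of Cloudera Manager; the function names `can_connect` and `is_port_free` are made up for illustration), the following Python uses only the standard `socket` module to test whether a TCP port accepts connections and whether a local port is still unbound:

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def is_port_free(port, bind_address="0.0.0.0"):
    """Return True if this machine can still bind the given TCP port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind((bind_address, port))
        return True
    except OSError:
        # Typically "address already in use" (or a permission problem).
        return False
    finally:
        sock.close()
```

Run on the host being added, `can_connect("<cm-server-hostname>", 7182)` should return `True`, and `is_port_free(9000)` and `is_port_free(9001)` should both return `True`. A `False` from `can_connect` only says the TCP connection failed; it cannot tell a firewall rule apart from a server that is not listening.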

My /etc/hosts file is configured as follows:

127.0.0.1 localhost
127.0.0.1 hadoop-ubuntu
192.168.5.xyz hadoop-ubuntu.dana.local hadoop-ubuntu
192.168.3.xyz ro-m81.dana.local ro-m81
192.168.3.abc ro-m41.dana.local ro-m41

# The following lines are desirable for IPv6 capable hosts

::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

The **/var/log/cloudera-scm-agent/cloudera-scm-agent.log** file shows the following error:

[09/Oct/2013 16:04:23 +0000] 4532 MainThread agent ERROR Heartbeating to 192.168.5.xyz:7182 failed.
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/agent.py", line 747, in send_heartbeat
    response = self.requestor.request('heartbeat', dict(request=heartbeat))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 145, in request
    return self.issue_request(call_request, message_name, request_datum)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 256, in issue_request
    call_response = self.transceiver.transceive(call_request)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 485, in transceive
    result = self.read_framed_message()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/avro-1.6.3-py2.6.egg/avro/ipc.py", line 489, in read_framed_message
    response = self.conn.getresponse()
  File "/usr/lib64/python2.6/httplib.py", line 990, in getresponse
    response.begin()
  File "/usr/lib64/python2.6/httplib.py", line 391, in begin
    version, status, reason = self._read_status()
  File "/usr/lib64/python2.6/httplib.py", line 349, in _read_status
    line = self.fp.readline()
  File "/usr/lib64/python2.6/socket.py", line 433, in readline
    data = recv(1)
error: [Errno 104] Connection reset by peer

Please help me find out why I get this error or what I am missing.

DanaMihai asked Nov 02 '22 13:11


2 Answers

I had the same issue. This is what did the trick for me.

Run ifconfig and find your IP address (the real one, not 127.0.0.1).

Run hostname to find your hostname.

Edit the /etc/hosts file.

Add an entry for your IP address there, something like:

192.168.8.xxx   hostname.test.com   hostname

Restart the Cloudera service, then go to hostname.test.com:7180 and try again. It should work. Even if it doesn't, go to http://hostname.test.com:7180/cmf/home and check the status of the hosts.

It turned out that, even though I was getting the heartbeat error, the host was actually up and running.
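
To double-check the edited file, something like the following could help. It is only a sketch under my own assumptions: the helper names (`parse_hosts`, `is_loopback`, `loopback_entries`) are invented here, and the loopback test is a plain string check so that masked addresses such as 192.168.8.xxx do not break it. It reads hosts-file text and reports whether a hostname is still listed on a loopback line, which is worth ruling out because a name listed under 127.0.0.1 can end up resolving to loopback instead of the real address:

```python
def parse_hosts(text):
    """Map each name in hosts-file text to the addresses it is listed under."""
    names = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        address, aliases = fields[0], fields[1:]
        for name in aliases:
            names.setdefault(name, []).append(address)
    return names


def is_loopback(address):
    """Treat 127.x.x.x and ::1 as loopback (string check, not full validation)."""
    return address.startswith("127.") or address == "::1"


def loopback_entries(text, hostname):
    """Return the loopback addresses that hostname is listed under, if any."""
    return [a for a in parse_hosts(text).get(hostname, []) if is_loopback(a)]
```

For example, `loopback_entries(open("/etc/hosts").read(), "hostname")` returning a non-empty list means the short name is still listed on a 127.x line as well.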

vishnu viswanath answered Nov 13 '22 04:11


I faced the same problem and then found a solution.

I used two machines, one as master and one as slave.

The master machine runs cloudera-scm-server.

I configured /etc/hosts on both machines, and the error went away.

The master machine's IP is 192.168.1.10.

/etc/hosts on the master machine:

127.0.0.1       localhost

192.168.1.10     <hostname>

The slave machine's IP is 192.168.1.8.

/etc/hosts on the slave machine:

127.0.0.1       localhost

192.168.1.8     <hostname>
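
Once both files are in place, a quick way to confirm that a name resolves to a real address is a small check like the one below. This is a sketch, not anything shipped with Cloudera: `check_resolution` is a name I made up, and the `resolver` parameter exists only so the function can be tried without touching real name resolution. It relies on the standard `socket.gethostbyname` and `ipaddress` modules:

```python
import ipaddress
import socket


def check_resolution(hostname, resolver=socket.gethostbyname):
    """Resolve hostname and report whether it maps to a non-loopback address."""
    try:
        address = resolver(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        return {"hostname": hostname, "address": None, "ok": False,
                "reason": "does not resolve: %s" % exc}
    if ipaddress.ip_address(address).is_loopback:
        return {"hostname": hostname, "address": address, "ok": False,
                "reason": "resolves to a loopback address"}
    return {"hostname": hostname, "address": address, "ok": True,
            "reason": "resolves to a non-loopback address"}
```

Running `check_resolution(socket.gethostname())` on each machine should give `ok: True` with that machine's own IP (192.168.1.10 on the master and 192.168.1.8 on the slave in the setup above).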
Gowtham Balusamy answered Nov 13 '22 05:11