I've followed the installation procedure from here, and when I reach the Inspect Role Assignments stage I only see one managed host: localhost.localdomain.
Any subsequent attempts to add other hosts have the same outcome.
What am I missing?
Update: I don't like to answer my own questions, so I am writing my answer here.
The solution is so obvious that I could not see it, and I left the problem unresolved for quite some time until it hit me while doing some checks.
The hostname provided at installation time was set in /etc/hosts for the IP 127.0.0.1, next to localhost.localdomain. This was misleading for the Cloudera setup and basically made all hosts appear to have the same IP and hostname.
I've redone the setup with hostname.domain.local, and now the hosts file features a separate line with the specific IP and hostname, and the /etc/resolv.conf file has a line with search domain.local.
Even after this unpleasant experience, I still think the installation documentation should mention these small details, although it might seem like stating the obvious.
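To illustrate the fix, the corrected files might look like this (the IP and domain below are placeholders for your own values, not what the installer produces):

```text
# /etc/hosts
127.0.0.1     localhost localhost.localdomain
192.168.1.10  hostname.domain.local hostname

# /etc/resolv.conf
search domain.local
```

The important part is that the machine's real hostname is on its own line with its real IP, not piled onto the 127.0.0.1 line.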
Looks like Cloudera (possibly recently) added a blurb about this to their documentation. I've been having this problem for a while, and the key for me was getting the following command to give correct results:
$ host -v -t A `hostname`
My solution involved setting up a local DNS server, but perhaps just having the same /etc/hosts on every node would have been sufficient. YMMV.
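If you want to script that sanity check, here is a small sketch. It assumes getent is available (it consults /etc/hosts as well as DNS, like the resolver does); the helper name is my own, not a standard tool:

```shell
# classify_addr ADDR: classify a resolved address as loopback, unresolved, or ok.
# A "loopback" result is the classic cause of Cloudera Manager seeing only
# localhost.localdomain as a managed host.
classify_addr() {
  case "$1" in
    127.*|::1) echo "loopback" ;;
    "")        echo "unresolved" ;;
    *)         echo "ok" ;;
  esac
}

# Apply it to this machine's hostname:
classify_addr "$(getent hosts "$(hostname)" | awk '{print $1; exit}')"
```

Anything other than "ok" means the hostname setup needs fixing before the installer will behave.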
All right, I implemented the cluster on virtual machines, so I wanted to share everything I did. In my cluster I created one manager node (only for Cloudera Manager), one namenode, and two datanodes. This made adding a new node to the cluster easy and problem-free. I also prepared a simple document with instructions. It may be a little summarized, but it works. Most of the commands are taken from various sites, so I tried to keep them as simple as I understand them. I added this answer here because my implementation also covers adding a new host to the cluster.
Note: I am very new to the Linux environment; I tried my best, and I welcome anyone who can correct my comments on usage or explanations.
==================================================================================
These instructions were implemented on CentOS 6.2 x64 (the non-live desktop version). If you use the server version, you may need to configure the network yourself.
Use the same version on all machines as much as possible. Some say the IP values of the machines are important, but I implemented this with different IP ranges, like one machine using 192.168.12.13 and another 192.168.13.144, and it did not create a problem.
I also used Oracle VirtualBox for the virtual machine environment on Windows 7 Enterprise.
Suggestion: once you create one common CentOS installation, create a clone in case any configuration goes wrong. Always keep a backup clone.
Download these files manually first:
Cloudera Manager (you can download the community edition). We need this for the master node, but that does not mean the master node is part of the cluster. I used the manager on a machine that has no namenode or job tracker, just the manager application.
Oracle JDK. You can download the proper one from the Oracle web site: go there and download it from your browser, or copy the link and use wget. It's your choice.
Be sure to uninstall OpenJDK:
yum remove java-1.6.0-openjdk
Install the Oracle JDK manually. Note that the wget URL may have changed; you can also download the file from your browser.
wget http://download.oracle.com/otn-pub/java/jdk/6u27-b07/jdk-6u27-linux-x64-rpm.bin
chmod u+x jdk-6u27-linux-x64-rpm.bin
./jdk-6u27-linux-x64-rpm.bin
Make our system and browsers use the new Java:
/usr/sbin/alternatives --install /usr/bin/java java /usr/java/default/bin/java 20000
/usr/sbin/alternatives --install /usr/lib/mozilla/plugins/libjavaplugin.so libjavaplugin.so /usr/java/default/jre/lib/i386/libnpjp2.so 20000
Add your user to the sudoers file:
nano /etc/sudoers
Find the line "root ALL=(ALL) ALL" and add this line below it:
username ALL=(ALL) ALL
// This line means that the user "username" can execute from ALL terminals, acting as ALL (any) users, and run ALL (any) commands. (Using visudo instead of editing /etc/sudoers directly is safer, since it validates the syntax when you save.)
Install the SSH server:
sudo yum install openssh-server
Check the sshd service status to be sure it is running:
/sbin/service sshd status
Start the sshd service if it is not started:
/sbin/service sshd start
Or you can simply test SSH with:
ssh localhost
After a successful test you can exit:
exit
These instructions are also described on the Cloudera web site. If you check /var/log/cloudera-scm-agent/cloudera-scm-agent.log (or the .out files) and see persistence- or Hibernate-related exceptions/errors, the problem is with the PostgreSQL database; it probably has not been set up yet. All we need to do is set it up.
Note: PostgreSQL is only needed for the manager (master) node; there is no need for it on the slaves.
Be sure a PostgreSQL instance is installed by checking the service status:
/etc/init.d/postgresql status
Note: the instructions below need repo configuration! If you do not know how, skip to the script file usage.
Install the embedded PostgreSQL database package on the Cloudera Manager Server host:
sudo yum install cloudera-manager-server-db
Prepare the embedded PostgreSQL database for use with the Cloudera Manager Server by running this command:
sudo /sbin/service cloudera-scm-server-db initdb
Start the embedded PostgreSQL database by running this command:
sudo /sbin/service cloudera-scm-server-db start
Script file usage: the instructions below set up PostgreSQL manually with the script file:
/usr/share/cmf/schema/scm_prepare_database.sh database-type [options] database-name username password
Required parameters and descriptions:
database-type: to connect to a MySQL database, specify mysql as the database type; to connect to an external PostgreSQL database, specify postgresql.
database-name: the name of the Cloudera Manager Server database you want to create.
username: the username for the Cloudera Manager Server database you want to create.
password: the password for the Cloudera Manager Server database you want to create. If you don't specify the password on the command line, the script will prompt you to enter it.
You can check this page for details : https://ccp.cloudera.com/display/ENT/Installation+Path+B+-+Installation+Using+Your+Own+Method#InstallationPathB-InstallationUsingYourOwnMethod-Step5%3AConfigureaDatabasefortheClouderaManagerServer
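As an illustration, an invocation for an external PostgreSQL database might look like this. The database name, user, and password below are placeholder assumptions of mine, not Cloudera defaults:

```shell
# Assemble the prepare-database call; the values here are examples only.
DB_TYPE=postgresql
DB_NAME=scm
DB_USER=scm
DB_PASS=scm_password
CMD="/usr/share/cmf/schema/scm_prepare_database.sh $DB_TYPE $DB_NAME $DB_USER $DB_PASS"
# Echoed rather than executed here, since the script only exists on a
# Cloudera Manager Server host; run it there with sudo.
echo "sudo $CMD"
```

On the manager host you would run the printed command directly instead of echoing it.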
Start PostgreSQL if it is not started (check the status and, to be sure, restart it):
/etc/init.d/postgresql start
If there are routing or firewall restrictions on Linux, the heartbeat of the agent will not reach the master node (manager), so we need to eliminate those obstacles. In this case SELinux and iptables can create problems. Cloudera says to disable iptables entirely, but if you are experienced with iptables configuration, you can add a rule like this instead.
Open the iptables config and set a rule allowing access to port 7180:
nano /etc/sysconfig/iptables
adding this line:
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 7180 -j ACCEPT
Or simply (the Cloudera way) disable iptables entirely; be sure it is the same on all nodes:
sudo /etc/init.d/iptables stop
Check the iptables status with the status parameter:
/etc/init.d/iptables status
Note: every time the machine restarts, iptables will be activated again, so you may want to stop it automatically (on CentOS 6, sudo chkconfig iptables off disables it across reboots). Any problem caused by iptables or SELinux will show up in the log file "cloudera-scm-agent.log". You may see some "deprecated" warnings about Python code; just ignore them. The errors/exceptions are generally "no route to host" or something similar.
Disable SELinux. You may need to do this before many of the operations above, especially when you try to install Cloudera Manager; Linux will warn you about SELinux.
sudo nano /etc/selinux/config
(set SELINUX=disabled; a reboot, or setenforce 0, is needed for it to take effect)
Set a unique hostname for each machine: on each machine edit this file and give that machine a name. We will use these names in the hosts file.
sudo nano /etc/sysconfig/network
Then modify the hosts file with the IP values and hostnames of all nodes. Do this on all nodes; you can simply copy the file to the other nodes, since all hosts files will be the same.
sudo nano /etc/hosts
Example:
127.0.0.1   localhost
192.168.1.2 masternode
192.168.1.3 namenode
192.168.1.4 datanode1
192.168.1.5 datanode2
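Since the hosts file must be identical everywhere, one way to avoid typos is to generate it once and copy it out. A sketch using the example addresses above (adjust the IPs and names to your own network):

```shell
# Write the shared hosts file to a staging location first.
cat > /tmp/hosts.cluster <<'EOF'
127.0.0.1   localhost
192.168.1.2 masternode
192.168.1.3 namenode
192.168.1.4 datanode1
192.168.1.5 datanode2
EOF

# Then install it locally and push it to the other nodes, e.g.:
#   sudo cp /tmp/hosts.cluster /etc/hosts
#   scp /tmp/hosts.cluster root@namenode:/etc/hosts
```

The copy commands are left as comments because they need root access on every node.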
Check the Cloudera Manager service status and, if needed, start or restart it:
sudo /sbin/service cloudera-scm-server start
Be sure your internet connection is good enough for all nodes, because the manager will connect to them and start a series of download operations on each of them. If the manager encounters any problem it will roll back everything, and that will force you to restart everything. Trust me, this part takes a lot of time!
If you are using virtual machines as nodes (which I did), you may choose bridged network mode so you can give internet connectivity to all nodes, but this has one downside: if you restart your physical machine, you may lose your IP values and automatically get new ones, which can force you to modify the hosts file on each node again. If you use NAT or something like an internal network, you can give static IP values to your nodes so there is no need for reconfiguration, but then you must provide an internet gateway IP for every machine, because not just the manager but also the agents need internet access to download files. Of course, once you finish setting up your cluster, you can remove the agent (slave) nodes' internet access.
You should try ifconfig when you start a virtual machine to see if it is getting an IP value from the network. If not, your virtual machine configuration in your VM application must be changed. If you are working on a physical machine that has both cable and wireless connectivity, you will have more than one ethernet adapter to choose from; be sure to choose the right one, since the wrong one will not give you an IP address.
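For reference, a static IP configuration on CentOS 6 might look like this (every value below is a placeholder for your own network):

```text
# /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE=eth0
BOOTPROTO=static
ONBOOT=yes
IPADDR=192.168.1.3
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
```

After editing, restart networking with sudo /sbin/service network restart.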
Be sure to use oracle JDK.
Check the cloudera-scm-server status from time to time:
sudo /sbin/service cloudera-scm-server status
Check that 7180 and the other Cloudera Manager related ports are being listened on; you can use "nmap" or "netstat --listen" (for example, netstat --listen --numeric | grep 7180).
If you are unable to install Cloudera Manager on the master node (probably a SELinux, PostgreSQL, or download problem; by the way, make sure the download cannot be interrupted), then you may need to clean up and restart.
This line will clean the Cloudera related files and allow you to start again:
sudo rm -Rf /usr/share/{cmf,hue} /var/lib/cloudera* /var/cache/yum/cloudera*
You can restart cloudera-scm-agent on the slave nodes (sudo /sbin/service cloudera-scm-agent restart) if you change anything, and to be sure the processes are working correctly. But you should clean the log files first, to see whether the new configuration is working properly; the log files are important for seeing what is going wrong or right.
cd /var/log/cloudera-scm-agent
sudo rm *
The next steps are adding hosts from the Cloudera Manager web interface:
On the manager machine I used "localhost:7180" to connect to the manager GUI. In the hosts part you will see the option to add a new host to the cluster. Just enter the name of the node in the textbox and press the "Find Hosts" button. The names of the hosts are already defined in the /etc/hosts file, if you remember, so you can use either the IP or the hostname in the textbox; if they are set up right, the manager will find the matching hosts and list them above. If they are not managed yet (meaning nothing is installed on them yet), the "Currently Managed" column will show "No"; otherwise it will show "Yes".
After that you can continue to install the Cloudera agent and Hadoop files on the chosen hosts. If you have already installed them (if they are managed), then you can begin to add services to them: just go to the "Services" page and continue. If you set up the hosts correctly and see that they are managed, then adding services is very easy and problem-free (at least it was for me).
Please send any comments about my answer. It is kind of long, maybe unnecessarily so, but I tried to include every detail.
I also had a similar problem. Cloudera Manager was able to install all the components, but the hosts weren't showing up in the managed hosts list.
In my case the IP/DNS name configuration was fine; I was able to do lookups successfully. Later I realized that Cloudera needs a bunch of ports to manage the nodes, and additional ports are needed for the various Hadoop services. Just to see whether the problem is caused by this, you can turn off the firewall temporarily. If that's the issue, refer to Cloudera's documentation for the list of ports. Currently it's located at: https://ccp.cloudera.com/display/ENT4DOC/Configuring+Ports+for+Cloudera+Manager
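One quick way to test whether a specific port is reachable from another node, without installing anything extra, is bash's /dev/tcp redirection. A sketch (the host name and port below are examples; check Cloudera's ports documentation for the full list your version needs):

```shell
# check_port HOST PORT: report whether a TCP connection can be opened.
check_port() {
  if timeout 2 bash -c "echo > /dev/tcp/$1/$2" 2>/dev/null; then
    echo "open"
  else
    echo "closed or unreachable"
  fi
}

# From a slave node you might run, for example:
#   check_port masternode 7180
```

If a port that should be open reports "closed or unreachable", the firewall rules on the target machine are the first thing to check.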