I see that there are several ways to start the Hadoop ecosystem:
start-all.sh & stop-all.sh
These say they are deprecated and that start-dfs.sh & start-yarn.sh should be used instead.
start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh
hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager
EDIT: I think there have to be some specific use cases for each command.
To start a Hadoop cluster you will need to start both the HDFS and the Map/Reduce daemons. The bin/start-dfs.sh script also consults the ${HADOOP_CONF_DIR}/slaves file on the NameNode and starts the DataNode daemon on all of the listed slaves.
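For illustration, the slaves file is simply a list of worker hostnames, one per line; a minimal sketch with purely hypothetical hostnames:
slave1.example.com
slave2.example.com
slave3.example.com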
We can restart the NameNode in the following ways: stop the NameNode individually using the /sbin/hadoop-daemon.sh stop namenode command and then start it again with /sbin/hadoop-daemon.sh start namenode, or use /sbin/stop-all.sh followed by /sbin/start-all.sh, which will stop all the daemons first and then start them all again.
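A quick sketch of both approaches, assuming the scripts live under $HADOOP_HOME/sbin as in a standard Hadoop 2.x layout:
# Option 1: restart only the NameNode daemon
$HADOOP_HOME/sbin/hadoop-daemon.sh stop namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
# Option 2: stop every daemon and start them all again (deprecated scripts, heavier)
$HADOOP_HOME/sbin/stop-all.sh
$HADOOP_HOME/sbin/start-all.sh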
start-all.sh & stop-all.sh : Used to start and stop the Hadoop daemons all at once. Issuing them on the master machine will start/stop the daemons on all the nodes of the cluster. Deprecated, as you have already noticed.
start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh : Same as above but start/stop HDFS and YARN daemons separately on all the nodes from the master machine. It is advisable to use these commands now over start-all.sh & stop-all.sh
hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager : Used to start individual daemons on an individual machine manually. You need to go to a particular node and issue these commands.
Use case : Suppose you have added a new DataNode (DN) to your cluster and you need to start the DN daemon only on that machine:
bin/hadoop-daemon.sh start datanode
Note : You should have passwordless SSH set up if you want to start all the daemons on all the nodes from one machine.
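A minimal sketch of setting up passwordless SSH from the master to a worker; the user and hostname (user@slave1) are hypothetical:
# Generate a key pair on the master (skip if ~/.ssh/id_rsa already exists)
ssh-keygen -t rsa -P "" -f ~/.ssh/id_rsa
# Copy the public key to the worker so the start scripts can log in without a password
ssh-copy-id user@slave1
# Verify: this should run without prompting for a password
ssh user@slave1 hostname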
Hope this answers your query.
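To make the recommended path concrete, here is a sketch of bringing a Hadoop 2.x cluster up and down from the master with the non-deprecated scripts, assuming $HADOOP_HOME points at your installation:
# Start the HDFS daemons (NameNode, DataNodes, SecondaryNameNode)
$HADOOP_HOME/sbin/start-dfs.sh
# Start the YARN daemons (ResourceManager, NodeManagers)
$HADOOP_HOME/sbin/start-yarn.sh
# Check which Java daemons are running on this node
jps
# Later, stop them in the reverse order
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/stop-dfs.sh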
From the Hadoop page:
start-all.sh
This will start up a NameNode, DataNode, JobTracker and a TaskTracker on your machine.
start-dfs.sh
This will bring up HDFS with the NameNode running on the machine you ran the command on. On such a machine you would need start-mapred.sh to start the JobTracker separately.
start-all.sh / stop-all.sh have to be run on the master node.
You would use start-all.sh on a single-node cluster (i.e. where all the services run on the same node; the NameNode is also the DataNode and is the master node).
In a multi-node setup, you will run start-all.sh on the master node and it will start what is necessary on the slaves as well.
Alternatively, use start-dfs.sh on the node you want the NameNode to run on. This will bring up HDFS with the NameNode running on the machine you ran the command on and DataNodes on the machines listed in the slaves file.
Use start-mapred.sh on the machine you plan to run the JobTracker on. This will bring up the Map/Reduce cluster with the JobTracker running on the machine you ran the command on and TaskTrackers running on the machines listed in the slaves file.
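As a sketch, the split looks like this on a Hadoop 1.x cluster; the hostnames in the comments are hypothetical, and each command is run on the machine named there:
# On the HDFS master (e.g. nn-host): starts the NameNode here and DataNodes on the slaves
bin/start-dfs.sh
# On the Map/Reduce master (e.g. jt-host): starts the JobTracker here and TaskTrackers on the slaves
bin/start-mapred.sh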
hadoop-daemon.sh, as stated by Tariq, is used on each individual node. The master node will not start the services on the slaves. In a single-node setup this will act the same as start-all.sh. In a multi-node setup you will have to access each node (master as well as slaves) and execute it on each of them.
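For example, to bring up a Hadoop 1.x multi-node cluster entirely with the per-daemon scripts, you would log in to each machine and run something like the following (which daemon goes where depends on your layout):
# On the master node
bin/hadoop-daemon.sh start namenode
bin/hadoop-daemon.sh start jobtracker
# On each slave node
bin/hadoop-daemon.sh start datanode
bin/hadoop-daemon.sh start tasktracker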
Have a look at the start-all.sh script itself: it calls the config script followed by the dfs and mapred start scripts.
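For reference, the core of the Hadoop 1.x bin/start-all.sh is roughly the following (paraphrased, so check the copy shipped with your version):
# Resolve the bin directory and load the shared configuration
bin=`dirname "$0"`
bin=`cd "$bin"; pwd`
. "$bin"/hadoop-config.sh
# Start the HDFS daemons, then the Map/Reduce daemons
"$bin"/start-dfs.sh --config $HADOOP_CONF_DIR
"$bin"/start-mapred.sh --config $HADOOP_CONF_DIR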