
What is the difference between hadoop job -kill job_id and yarn application -kill application_id

Tags: hadoop, hive

What is the difference between hadoop job -kill job_id and yarn application -kill application_id? Do job_id and application_id refer to the same task?

xiaojie.wu asked May 19 '15 03:05



2 Answers

Both hadoop job -kill job_id and yarn application -kill application_id are used to kill a job running on Hadoop.

If you are using MapReduce version 1 (MRv1) and want to kill a job running on Hadoop, use hadoop job -kill job_id. It kills the job with that ID, including all of its running and queued tasks.

In MapReduce version 2 (MRv2, which runs on YARN), a submitted MapReduce job is processed through an ApplicationMaster, and so the job is called an application. There can be multiple tasks running within an application. To kill an application, use yarn application -kill application_id. It kills all running and queued work under that application.

To kill a particular task on YARN, use hadoop job -kill-task <task-id>.
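As a rough sketch of the distinction above, the shell function below picks a kill command from the prefix of the ID and prints it rather than running it, so it can be tried without a cluster. The function name kill_cmd and the example IDs are hypothetical and only for illustration.

```shell
#!/bin/sh
# Dry-run helper: print the kill command that matches the ID prefix.
# kill_cmd is a hypothetical name; nothing is executed against a cluster.
kill_cmd() {
  case "$1" in
    application_*)
      # YARN application ID: kill the whole application.
      echo "yarn application -kill $1"
      ;;
    job_*)
      # MapReduce job ID: kill the job.
      echo "hadoop job -kill $1"
      ;;
    *)
      echo "unrecognized ID: $1" >&2
      return 1
      ;;
  esac
}

# Example with made-up IDs:
kill_cmd application_1432000000000_0001
kill_cmd job_1432000000000_0001
```

Printing the command first makes it easy to check which tool would be called before actually killing anything.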

This link will be useful to understand application and job in YARN.

Sandeep Singh answered Oct 26 '22 19:10


The application_id is the ID associated with the ApplicationMaster. The two IDs are one and the same (they have the same ID value) except for the prefixes application_ and job_.

Both refer to the same job.
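Since the two IDs differ only in their prefix, converting between them is a simple string substitution. The following is a minimal sketch in POSIX shell; the function names job_to_app and app_to_job and the example ID are hypothetical.

```shell
#!/bin/sh
# Convert between a MapReduce job ID and the matching YARN application ID
# by swapping the prefix. Function names here are hypothetical.
job_to_app() {
  # Strip a leading "job_" and prepend "application_".
  printf 'application_%s\n' "${1#job_}"
}

app_to_job() {
  # Strip a leading "application_" and prepend "job_".
  printf 'job_%s\n' "${1#application_}"
}

# Example with a made-up ID:
job_to_app job_1432000000000_0001
app_to_job application_1432000000000_0001
```

With these, an ID taken from one tool's output can be fed to the other tool's kill command.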

Partha Kaushik answered Oct 26 '22 17:10