Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Hadoop query regarding setJarByClass method of Job class

Tags:

In the Hadoop API documentation it's given

that

setJarByClass   public void setJarByClass(Class<?> cls)  Set the Jar by finding where a given class came from. 

What exactly does this explanation mean? does it creates a JAR file from the class file argument specified in the method above? and does that jar file is executed for the MapReduce task.?

like image 922
Tarun Sapra Avatar asked Oct 12 '10 06:10

Tarun Sapra


People also ask

What does setJarByClass do?

setJarByClass(WordCount. class); Helps to identify the Jar which contains the Mapper and Reducer by specifying a class in that Jar.

What is a job in Hadoop?

A Hadoop Map Reduce job defines, schedules, monitors, and manages the execution of Hadoop Map Reduce . jar files. You can bundle your Map Reduce code in a . jar file and run it using this job.

What is job class in Java?

Job Class. The Job class is the most important class in the MapReduce API. It allows the user to configure the job, submit it, control its execution, and query the state. The set methods only work until the job is submitted, afterwards they will throw an IllegalStateException.


1 Answers

This method sets the jar file in which each node will look for the Mapper and Reducer classes.

It does not create a jar from the given class. Rather, it identifies the jar containing the given class. And yes, that jar file is "executed" (really the Mapper and Reducer in that jar file are executed) for the MapReduce job.

(Also see Stanley Xu's answer to a similar question about the need for this method since you give the jar on the command line)

like image 161
Mark Avatar answered Oct 06 '22 01:10

Mark