
Jar file for MapReduce new API Job.getInstance(Configuration, String)

Tags: java, hadoop

I have set up Hadoop 2.2 and am trying to remove the deprecated API call

    Job job = new Job(conf, "word count");

from the WordCount example that ships with Hadoop.

I replaced the deprecated call with

EDIT:

    Job job = Job.getInstance(conf, "word count");

The compile error is

Job.getInstance cannot be resolved to a type.

The Job class that is already imported (old API or MR1) does not seem to have this method.

Which jar contains the new Job class with the Job.getInstance(Configuration, String) method?

How do I resolve this? Are there any additional changes needed to migrate the example to MapReduce v2?

sio2deep asked Jan 22 '14 06:01


2 Answers

I solved this by adding hadoop-core as a dependency. I had specified only hadoop-common.

    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-common</artifactId>
        <version>2.7.2</version>
    </dependency>

Jonathan Morales Vélez answered Nov 15 '22 07:11


Job.getInstance cannot be resolved to a type.

You get this error because the required libraries are not on your application's class path. To resolve it, make sure the hadoop-core-*.jar file is on the class path.

By the way, which jar contains this new Job class with the Job.getInstance(Configuration, String) method?

The org.apache.hadoop.mapreduce.Job class is contained in the hadoop-core-*.jar file. The jar file name carries a suffix with the Hadoop version and the vendor name (cdh for Cloudera, hdf for Hortonworks, etc.).
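If you are unsure which jar a class is actually being loaded from, you can ask the JVM directly. The following is only a minimal sketch using plain reflection: the ClassLocator class and its helper names are made up for illustration and are not part of Hadoop. It defaults to a JDK class so it runs without Hadoop on the class path. Note that in Hadoop 2.x distributions the Job class is commonly shipped in hadoop-mapreduce-client-core-*.jar rather than hadoop-core-*.jar, so it is worth checking your own installation.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.security.CodeSource;

// Hypothetical helper, not part of Hadoop: reports where a class was loaded
// from and whether it exposes a given public static method.
final class ClassLocator {

    /** Returns the jar or directory a class was loaded from, or a note if none is reported. */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // Core JDK classes usually report no code source at all.
        if (source == null || source.getLocation() == null) {
            return "(no code source reported, probably a JDK class)";
        }
        return source.getLocation().toString();
    }

    /** True if the class exposes a public static method with this name and parameter types. */
    static boolean hasStaticMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            Method method = type.getMethod(name, parameterTypes);
            return Modifier.isStatic(method.getModifiers());
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // args: class name, method name, then the parameter type class names.
        String className = args.length > 0 ? args[0] : "java.util.Calendar";
        String methodName = args.length > 1 ? args[1] : "getInstance";
        Class<?>[] parameterTypes = new Class<?>[Math.max(0, args.length - 2)];
        for (int i = 2; i < args.length; i++) {
            parameterTypes[i - 2] = Class.forName(args[i]);
        }

        Class<?> type = Class.forName(className);
        System.out.println(className + " loaded from: " + locationOf(type));
        System.out.println("has static " + methodName + ": "
                + hasStaticMethod(type, methodName, parameterTypes));
    }
}
```

To check the case from the question, run it with your project's class path and the arguments `org.apache.hadoop.mapreduce.Job getInstance org.apache.hadoop.conf.Configuration java.lang.String`. If the reported jar is an old one, or the method is reported as missing, the class path is picking up the wrong library.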

Suggestion:

Job.getInstance() is a static method, so you do not need an instance of the Job class to call it. getInstance() itself creates a new instance of the Job class, so if you have already created one with the new keyword, you do not need to call getInstance() again.

Replace Job job = new Job.getInstance(conf, "word count"); with Job job = Job.getInstance(conf, "word count");
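To see why the `new` keyword matters here, a small stand-alone sketch may help. The Task class below is hypothetical and only mimics the shape of a static factory; it is not Hadoop's Job class (which, unlike this sketch, still has public deprecated constructors).

```java
// Hypothetical class with a static factory method, for illustration only.
final class Task {
    private final String name;

    // Private constructor: callers go through the factory method instead.
    private Task(String name) {
        this.name = name;
    }

    /** Static factory, called on the class itself rather than on an instance. */
    static Task getInstance(String name) {
        return new Task(name);
    }

    String getName() {
        return name;
    }
}

final class StaticFactoryDemo {
    public static void main(String[] args) {
        // Correct: no `new` keyword in front of a static factory call.
        Task task = Task.getInstance("word count");
        System.out.println(task.getName());

        // Would not compile: after `new`, the compiler reads `Task.getInstance`
        // as the name of a type and fails because no such type exists.
        // Task broken = new Task.getInstance("word count");
    }
}
```

The commented-out line is the same shape as `new Job.getInstance(conf, "word count")`, which is why a compiler reports an error like the one in the question instead of complaining about a missing method.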

Ankur Shanbhag answered Nov 15 '22 07:11