
MapReduceBase and Mapper deprecated

public static class Map extends MapReduceBase implements Mapper

MapReduceBase, Mapper and JobConf are deprecated in Hadoop 0.20.203.

What should we use now?

Edit 1 - For the Mapper and MapReduceBase, I found that we just need to extend Mapper:

// Extends org.apache.hadoop.mapreduce.Mapper (new API),
// not the org.apache.hadoop.mapred.Mapper interface (old API).
public static class Map extends Mapper
            <LongWritable, Text, Text, IntWritable> {
  private final static IntWritable one = new IntWritable(1);
  private Text word = new Text();

  // In the new API a single Context replaces OutputCollector and Reporter.
  @Override
  public void map(LongWritable key, Text value, Context context)
         throws IOException, InterruptedException {
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
      word.set(tokenizer.nextToken());
      context.write(word, one);
    }
  }
}

Edit 2 - For JobConf, we should use Configuration together with Job, like this:

public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = new Job(conf);
    job.setMapperClass(WordCount.Map.class);
}

Edit 3 - I found a good tutorial covering the new API: http://sonerbalkir.blogspot.com/2010/01/new-hadoop-api-020x.html

JohnJohnGa asked Oct 02 '11 11:10



2 Answers

The Javadoc says what to use instead of these deprecated classes:

e.g. http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/JobConf.html

 Deprecated. Use Configuration instead

Edit: If you use Maven and open a class declaration (F3), Maven can automatically download the source code, and you'll see the Javadoc comments with their explanations.

mmatloka answered Oct 01 '22 00:10



There is not much difference in functionality between the old and the new API, except that the old API only pushes records to the map/reduce functions, while the new API supports both push and pull. The new API is, however, much cleaner and easier to evolve.
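As a rough illustration of the push versus pull distinction, here is a minimal sketch in plain Java. The classes Context, TokenMapper and StopAtBlankLineMapper are hypothetical stand-ins, not Hadoop classes. They only mimic the shape of the new API, where Mapper.run(Context) pulls records with context.nextKeyValue() and, by default, pushes each one into map().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class PushPullSketch {

    /** Hypothetical stand-in for a context object: input and output in one place. */
    static final class Context {
        private final Iterator<String> input;
        private String current;
        final List<String> output = new ArrayList<>();

        Context(List<String> lines) {
            this.input = lines.iterator();
        }

        /** Advances to the next record; returns false when the input is exhausted. */
        boolean nextKeyValue() {
            if (!input.hasNext()) {
                return false;
            }
            current = input.next();
            return true;
        }

        String getCurrentValue() {
            return current;
        }

        void write(String word) {
            output.add(word);
        }
    }

    /** Stand-in mapper: map() receives pushed records, run() is the pull loop. */
    static class TokenMapper {
        // Push style: called once per record with the record handed in.
        void map(String line, Context context) {
            for (String token : line.trim().split("\\s+")) {
                if (!token.isEmpty()) {
                    context.write(token);
                }
            }
        }

        // Default driver: pulls every record and pushes it into map().
        void run(Context context) {
            while (context.nextKeyValue()) {
                map(context.getCurrentValue(), context);
            }
        }
    }

    /** Pull style: overrides run() to control iteration and stop at the first blank line. */
    static class StopAtBlankLineMapper extends TokenMapper {
        @Override
        void run(Context context) {
            while (context.nextKeyValue()) {
                String line = context.getCurrentValue();
                if (line.trim().isEmpty()) {
                    break; // stop early; remaining records are never read
                }
                map(line, context);
            }
        }
    }

    static List<String> runMapper(TokenMapper mapper, List<String> lines) {
        Context context = new Context(lines);
        mapper.run(context);
        return context.output;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("hello world", "", "never reached");
        System.out.println(runMapper(new TokenMapper(), lines));
        System.out.println(runMapper(new StopAtBlankLineMapper(), lines));
    }
}
```

With an old-style mapper only the map() callback is available, so the early exit in StopAtBlankLineMapper has no direct equivalent there; in this sketch it is possible only because run() owns the loop.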

Here is the JIRA for the introduction of the new API. Also, the old API has been un-deprecated in 0.21 and will be deprecated in release 0.22 or 0.23.

You can find more information about the new API, sometimes called the 'context objects' API, here and here.

Praveen Sripati answered Sep 30 '22 23:09
