
JobConf v/s Configuration for Hadoop 1.0.4

Hi, I am new to Hadoop and its FileSystem. I saw two different WordCount examples, one using JobConf and one using Configuration. What is the difference between them?

I read that JobConf was part of the old package org.apache.hadoop.mapred (which was deprecated in 0.20.x), while Configuration is used with the new package org.apache.hadoop.mapreduce. But now, in v1.0.4, the old package is un-deprecated.

Currently there are two ways to write MapReduce jobs in Java: one is by extending classes in the org.apache.hadoop.mapreduce package, and the other is by implementing interfaces in the org.apache.hadoop.mapred package.

I want to know:

  1. What is the difference between the mapred and mapreduce package structures, and why was mapred un-deprecated?

  2. Which approach is better to use with v1.0.4, and why: JobConf or Configuration?

  3. Which is better for v1.0.4: mapred or mapreduce?

Abhendra Singh asked Feb 19 '13 14:02





1 Answer

If you look at the releases page, you can see that 1.0.4 corresponds roughly to the 0.20.20x line.

To give some context, here is what was being discussed on the mailing list:

The "old" MapReduce API in org.apache.hadoop.mapred was deprecated in the 0.20 
release series when the "new" (Context Objects) MapReduce API was added in
org.apache.hadoop.mapreduce. Unfortunately, the new API was not complete in 0.20
and most users stayed with the old API. This has led to the confusing situation 
where the old API is generally recommended, even though it is deprecated.

So as you can see, it's mainly a matter of backward compatibility.

The bottom line: if you are starting a new application on 1.0.4, you should use mapreduce rather than mapred, since it is now the preferred API. You can still use the old mapred if you have legacy applications. That in turn means you should configure your job with Configuration (passed to a Job) rather than with JobConf.

As for the difference between mapred and mapreduce, as explained in the excerpt above, it mainly comes from the introduction of Context objects, but there are also a number of other changes and new classes that are not available in the old mapred.
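
To make the Context point concrete, here is a minimal sketch of how the two mapper shapes differ. In the old API you implement an interface whose map method receives the output collector and the reporter as separate arguments; in the new API you extend a base class whose map method receives a single Context. Every type in this sketch (Collector, Reporter, Context, OldStyleMapper, NewStyleMapper) is a hypothetical stand-in written so that it runs without Hadoop on the classpath. None of them are the real Hadoop classes, and the real signatures are generic over Writable key and value types.

```java
import java.util.ArrayList;
import java.util.List;

// Self-contained sketch: every type below is a hypothetical stand-in, NOT a Hadoop class.
final class ApiShapeSketch {

    // Old style: output and progress reporting arrive as two separate arguments
    // (the roles played by OutputCollector and Reporter in org.apache.hadoop.mapred).
    interface Collector { void collect(String key, int value); }
    interface Reporter { void incrCounter(String name, long amount); }

    // Old style: the mapper is an interface that you implement.
    interface OldStyleMapper {
        void map(long offset, String line, Collector output, Reporter reporter);
    }

    // New style: one context object carries both the output and the counters.
    static final class Context {
        final List<String> emitted = new ArrayList<>();
        long linesSeen = 0;
        void write(String key, int value) { emitted.add(key + "=" + value); }
        // The counter name is ignored in this sketch; only one counter is tracked.
        void incrCounter(String name, long amount) { linesSeen += amount; }
    }

    // New style: the mapper is a base class that you extend.
    abstract static class NewStyleMapper {
        abstract void map(long offset, String line, Context context);
    }

    // Word count written against the old-style shape.
    static final class OldWordCount implements OldStyleMapper {
        @Override
        public void map(long offset, String line, Collector output, Reporter reporter) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    output.collect(word, 1);
                }
            }
            reporter.incrCounter("lines", 1);
        }
    }

    // The same word count written against the new-style shape.
    static final class NewWordCount extends NewStyleMapper {
        @Override
        void map(long offset, String line, Context context) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    context.write(word, 1);
                }
            }
            context.incrCounter("lines", 1);
        }
    }

    // Drives the old-style mapper over one line and returns what it emitted.
    static List<String> runOld(String line) {
        List<String> emitted = new ArrayList<>();
        Collector output = (key, value) -> emitted.add(key + "=" + value);
        Reporter reporter = (name, amount) -> { };
        new OldWordCount().map(0L, line, output, reporter);
        return emitted;
    }

    // Drives the new-style mapper over one line and returns what it emitted.
    static List<String> runNew(String line) {
        Context context = new Context();
        new NewWordCount().map(0L, line, context);
        return context.emitted;
    }

    public static void main(String[] args) {
        System.out.println(runOld("to be or not to be"));
        System.out.println(runNew("to be or not to be"));
    }
}
```

Both mappers emit the same pairs; only the way the framework hands them their output channel differs. Bundling everything into one context object is what lets the new API add capabilities without changing the map signature, which is presumably part of why it is the preferred choice for new code.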

Charles Menguy answered Sep 27 '22 23:09
