
What is the purpose of the Configured class in Hadoop programs?

Most Hadoop MapReduce programs look like this:

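import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MyApp extends Configured implements Tool {
    @Override
    public int run(String[] args) throws Exception {
        // getConf() is inherited from Configured
        Job job = new Job(getConf());
        /* process command line options */
        return job.waitForCompletion(true) ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        int exitCode = ToolRunner.run(new MyApp(), args);
        System.exit(exitCode);
    }
}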

What is the purpose of Configured? Tool and Configured both have getConf() and setConf(), so what does Configured add to our application?

Majid Azimi asked Jan 03 '13 07:01


1 Answer

Configured is a base class that implements the Configurable interface; it supplies the implementations of getConf() and setConf().

Simply extending this base class lets your class be configured with a Configuration object, and Configuration itself has more than one implementation.

When your code executes the following line:

ToolRunner.run(new MyApp(), args);

ToolRunner internally delegates to:

ToolRunner.run(tool.getConf(), tool, args);

In the above case, tool is the MyApp instance, which implements Tool. As you said, Tool also has getConf(), but only as an interface method (Tool extends Configurable); the implementation comes from the Configured base class. If you do not extend Configured in the code above, you will have to implement getConf() and setConf() yourself.
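To make the difference concrete, here is a minimal sketch that compiles without Hadoop on the classpath. The types named Configuration, Configurable, Tool and Configured below are simplified local stand-ins, not the real Hadoop classes, and SketchToolRunner, WithConfigured and WithoutConfigured are hypothetical names made up for illustration. The runner only mirrors the delegation described above (read the tool's configuration, fall back to a new one, pass it back through setConf(), call run()); the generic option parsing that the real ToolRunner performs is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the Hadoop types, defined locally so the sketch
// compiles on its own. They are NOT the real org.apache.hadoop classes.
class Configuration {
    private final Map<String, String> props = new HashMap<>();
    void set(String name, String value) { props.put(name, value); }
    String get(String name) { return props.get(name); }
}

interface Configurable {
    void setConf(Configuration conf);
    Configuration getConf();
}

interface Tool extends Configurable {
    int run(String[] args) throws Exception;
}

// What the base class contributes: one field and the two accessors.
class Configured implements Configurable {
    private Configuration conf;
    @Override public void setConf(Configuration conf) { this.conf = conf; }
    @Override public Configuration getConf() { return conf; }
}

// Hypothetical runner mirroring the delegation: use the tool's configuration
// if it has one, otherwise create a new one, then hand it back via setConf().
final class SketchToolRunner {
    static int run(Tool tool, String[] args) throws Exception {
        Configuration conf = tool.getConf();
        if (conf == null) {
            conf = new Configuration();
        }
        tool.setConf(conf);
        return tool.run(args);
    }
}

// With the base class: only run() has to be written.
class WithConfigured extends Configured implements Tool {
    @Override
    public int run(String[] args) {
        return getConf() != null ? 0 : 1;
    }
}

// Without the base class: the same accessors must be written by hand.
class WithoutConfigured implements Tool {
    private Configuration conf;
    @Override public void setConf(Configuration conf) { this.conf = conf; }
    @Override public Configuration getConf() { return conf; }
    @Override
    public int run(String[] args) {
        return getConf() != null ? 0 : 1;
    }
}
```

In this sketch, SketchToolRunner.run(new WithConfigured(), new String[0]) and SketchToolRunner.run(new WithoutConfigured(), new String[0]) behave the same way; the only difference is who wrote getConf() and setConf().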

shazin answered Oct 20 '22 22:10
