 

Global variables in Hadoop

My program follows an iterative map/reduce approach, and it needs to stop once certain conditions are met. Is there any way I can set a global variable that is distributed across all map/reduce tasks, and check whether that variable has reached the completion condition?

Something like this:

while (!condition) {

    Configuration conf = getConf();
    Job job = new Job(conf, "Dijkstra Graph Search");

    job.setJarByClass(GraphSearch.class);
    job.setMapperClass(DijkstraMap.class);
    job.setReducerClass(DijkstraReduce.class);

    job.setOutputKeyClass(IntWritable.class);
    job.setOutputValueClass(Text.class);

    // run the iteration, then re-check the shared condition
    job.waitForCompletion(true);
}

Here condition is a global variable that is modified during or after each map/reduce execution.

Deepak asked May 22 '10


1 Answer

This is how it works in Hadoop 2.0:

In your driver:

 conf.set("my.dijkstra.parameter", "value");
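In the iterative loop from the question, this call would go just before the Job is created in each pass, so that every task of that job sees the value computed so far. A minimal sketch (previousIterationResult is a hypothetical variable, not something from the original post):

    Configuration conf = getConf();
    // push the value computed after the last iteration into every task of this job
    conf.set("my.dijkstra.parameter", String.valueOf(previousIterationResult));
    Job job = Job.getInstance(conf, "Dijkstra Graph Search");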

And in your Mapper:

private String strProp;

@Override
protected void setup(Context context) throws IOException, InterruptedException {
    Configuration conf = context.getConfiguration();
    strProp = conf.get("my.dijkstra.parameter");
    // and then you can use strProp in map()
}
Nilesh answered Sep 18 '22