 

Accessing a mapper's counter from a reducer

Tags: java, hadoop

I need to access the counters from my mapper in my reducer. Is this possible? If so how is it done?

As an example: my mapper is:

public class CounterMapper extends Mapper<Text,Text,Text,Text> {

    static enum TestCounters { TEST }

    @Override
    protected void map(Text key, Text value, Context context)
                    throws IOException, InterruptedException {
        context.getCounter(TestCounters.TEST).increment(1);
        context.write(key, value);
    }
}

My reducer is:

public class CounterReducer extends Reducer<Text,Text,Text,LongWritable> {

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
                        throws IOException, InterruptedException {
        Counter counter = context.getCounter(CounterMapper.TestCounters.TEST);
        long counterValue = counter.getValue();
        context.write(key, new LongWritable(counterValue));
    }
}

counterValue is always 0. Am I doing something wrong or is this just not possible?

asked Mar 27 '11 by asdf

4 Answers

Here is Jeff G's solution implemented against the new API:

    // In the Reducer class (new API). COUNTER_NAME is the enum constant
    // your mapper increments, e.g. CounterMapper.TestCounters.TEST.
    private long mapperCounter;

    @Override
    public void setup(Context context) throws IOException, InterruptedException {
        Configuration conf = context.getConfiguration();
        Cluster cluster = new Cluster(conf);
        Job currentJob = cluster.getJob(context.getJobID());
        mapperCounter = currentJob.getCounters().findCounter(COUNTER_NAME).getValue();
    }
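For reference, the new-API types used in this snippet live in the following packages (this assumes Hadoop 0.21 or later, where org.apache.hadoop.mapreduce.Cluster was introduced):

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Cluster;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Reducer;
```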
answered Nov 15 '22 by itzhaki

In the Reducer's configure(JobConf), you can use the JobConf object to look up the reducer's own job ID. With that, your reducer can create its own JobClient (i.e. a connection to the JobTracker) and query the counters for this job (or any job, for that matter).

// in the Reducer class...
private long mapperCounter;

@Override
public void configure(JobConf conf) {
    JobClient client = new JobClient(conf);
    RunningJob parentJob = 
        client.getJob(JobID.forName( conf.get("mapred.job.id") ));
    mapperCounter = parentJob.getCounters().getCounter(MAP_COUNTER_NAME);
}

Now you can use mapperCounter inside the reduce() method itself.

You'll actually need a try/catch here, since the JobClient calls throw IOException. I'm using the old API, but it shouldn't be hard to adapt to the new one.
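A sketch of what that try/catch might look like wrapped around the configure() body above (old API; MAP_COUNTER_NAME stands in for your mapper's counter enum constant, and rethrowing as RuntimeException is just one way to handle the failure):

```java
// In the Reducer class (old API). JobClient, RunningJob.getCounters(),
// and JobClient.getJob() all throw IOException, hence the try/catch.
private long mapperCounter;

@Override
public void configure(JobConf conf) {
    try {
        JobClient client = new JobClient(conf);
        RunningJob parentJob =
            client.getJob(JobID.forName(conf.get("mapred.job.id")));
        mapperCounter = parentJob.getCounters().getCounter(MAP_COUNTER_NAME);
    } catch (IOException e) {
        // Fail the task rather than silently proceed with a stale/zero count
        throw new RuntimeException("Could not read mapper counters", e);
    }
}
```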

Note that all mappers' counters should be finalized before any reducer starts, so, contrary to Justin Thomas's comment, I believe you should get accurate values (as long as the reducers aren't incrementing the same counter!).

answered Nov 14 '22 by Jeff G


The whole point of map/reduce is to parallelize the work. There will be many independent mappers and reducers, so a counter's value wouldn't be correct anyway except for that particular run of the map/reduce pair.

They have a word count example:

http://wiki.apache.org/hadoop/WordCount

You could change context.write(word, one) to context.write(line, one).

answered Nov 15 '22 by Justin Thomas


The global counter values are never broadcast back to each mapper or reducer. If you want the number of mapper records to be available to the reducer, you'll need to rely on some external mechanism to do this.
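One common external mechanism (a sketch, not from this answer): run the counting job first, read its counters in the driver, and pass the value to a second job via its Configuration. The key name "mapper.record.count" and the job names below are hypothetical:

```java
// Driver-side sketch (new API): read a counter from a completed job
// and hand it to a follow-up job through that job's Configuration.
Job countJob = Job.getInstance(new Configuration(), "count-pass");
// ... configure mapper/reducer/input/output paths for countJob ...
countJob.waitForCompletion(true);

long mapped = countJob.getCounters()
        .findCounter(CounterMapper.TestCounters.TEST).getValue();

Configuration conf2 = new Configuration();
conf2.setLong("mapper.record.count", mapped);  // hypothetical key
Job secondJob = Job.getInstance(conf2, "use-count");
// Reducers of secondJob can then read the value with
// context.getConfiguration().getLong("mapper.record.count", 0L);
```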

answered Nov 15 '22 by bajafresh4life