
Found interface org.apache.hadoop.mapreduce.TaskAttemptContext

I haven't seen a solution to my particular problem so far; at least, nothing I've tried works, and it's driving me pretty crazy. This particular combination doesn't turn up much on Google. From what I can tell, the error occurs as the job enters the mapper. The input to this job is Avro-schema'd output compressed with deflate, though I tried uncompressed input as well.

Avro: 1.7.7 Hadoop: 2.4.1

I am getting this error and I'm not sure why. Here are my job, mapper, and reducer. The error happens as soon as the mapper runs.

Sample uncompressed Avro input record (StockReport.SCHEMA$ is defined to match this structure):

{"day": 3, "month": 2, "year": 1986, "stocks": [{"symbol": "AAME", "timestamp": 507833213000, "dividend": 10.59}]}

Job

@Override
public int run(String[] strings) throws Exception {
    Job job = Job.getInstance();
    job.setJobName("GenerateGraphsJob");
    job.setJarByClass(GenerateGraphsJob.class);

    configureJob(job);

    int resultCode = job.waitForCompletion(true) ? 0 : 1;

    return resultCode;
}

private void configureJob(Job job) throws IOException {
    try {
        Configuration config = getConf();
        Path inputPath = ConfigHelper.getChartInputPath(config);
        Path outputPath = ConfigHelper.getChartOutputPath(config);

        job.setInputFormatClass(AvroKeyInputFormat.class);
        AvroKeyInputFormat.addInputPath(job, inputPath);
        AvroJob.setInputKeySchema(job, StockReport.SCHEMA$);


        job.setMapperClass(StockAverageMapper.class);
        job.setCombinerClass(StockAverageCombiner.class);
        job.setReducerClass(StockAverageReducer.class);

        FileOutputFormat.setOutputPath(job, outputPath);

    } catch (IOException | ClassCastException e) {
        LOG.error("A job error has occurred.", e);
    }
}

Mapper:

public class StockAverageMapper extends
        Mapper<AvroKey<StockReport>, NullWritable, StockYearSymbolKey, StockReport> {
    private static Logger LOG = LoggerFactory.getLogger(StockAverageMapper.class);

    private final StockReport stockReport = new StockReport();
    private final StockYearSymbolKey stockKey = new StockYearSymbolKey();

    @Override
    protected void map(AvroKey<StockReport> inKey, NullWritable ignore, Context context)
            throws IOException, InterruptedException {
        try {
            StockReport inKeyDatum = inKey.datum();
            for (Stock stock : inKeyDatum.getStocks()) {
                updateKey(inKeyDatum, stock);
                updateValue(inKeyDatum, stock);
                context.write(stockKey, stockReport);
            }
        } catch (Exception ex) {
            LOG.debug(ex.toString());
        }
    }
}

Schema for map output key:

{
  "namespace": "avro.model",
  "type": "record",
  "name": "StockYearSymbolKey",
  "fields": [
    {
      "name": "year",
      "type": "int"
    },
    {
      "name": "symbol",
      "type": "string"
    }
  ]
}

Stack trace:

java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.TaskAttemptContext, but class was expected
    at org.apache.avro.mapreduce.AvroKeyInputFormat.createRecordReader(AvroKeyInputFormat.java:47)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:492)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:735)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Edit: Not that it matters, but I'm working toward reducing this to data I can create JFreeChart output from. The job doesn't get through the mapper, so that shouldn't be related.

asked Apr 04 '15 by Rig


1 Answer

The problem is that org.apache.hadoop.mapreduce.TaskAttemptContext was a class in Hadoop 1 but became an interface in Hadoop 2.

This is one of the reasons why libraries which depend on the Hadoop libs need to have separately compiled jarfiles for Hadoop 1 and Hadoop 2. Based on your stack trace, it appears that somehow you got a Hadoop1-compiled Avro jarfile, despite running with Hadoop 2.4.1.
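As a quick sanity check (a sketch, not part of the original answer), you can use reflection at runtime to confirm which flavor of `TaskAttemptContext` your classpath actually resolves; on a Hadoop 2 classpath it should report an interface, while under Hadoop 1 it was a concrete class:

```java
// Minimal sketch: report whether a named type on the classpath is an
// interface. Run it against org.apache.hadoop.mapreduce.TaskAttemptContext
// on your job's classpath: Hadoop 2 should print true, Hadoop 1 false.
public class CheckInterface {
    public static boolean isInterface(String className) throws ClassNotFoundException {
        return Class.forName(className).isInterface();
    }

    public static void main(String[] args) throws Exception {
        String name = args.length > 0
                ? args[0]
                : "org.apache.hadoop.mapreduce.TaskAttemptContext";
        System.out.println(name + " is interface: " + isInterface(name));
    }
}
```

Running this with the same classpath your job uses tells you which Hadoop API generation the loaded classes were built for.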

The Avro download mirrors provide separate artifacts for the two generations: avro-mapred-1.7.7-hadoop1.jar vs. avro-mapred-1.7.7-hadoop2.jar. Make sure you're pulling in the hadoop2 one.
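If you're building with Maven, the Hadoop 2 build of avro-mapred is selected with the `hadoop2` classifier; a minimal sketch of the dependency (adjust coordinates to your build):

```xml
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-mapred</artifactId>
  <version>1.7.7</version>
  <classifier>hadoop2</classifier>
</dependency>
```

After changing it, re-checking `mvn dependency:tree` for the avro-mapred entry confirms which classifier actually ends up on the classpath.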

answered Sep 20 '22 by Dennis Huo