Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

Question

I am trying to run a map/reducer in java. Below are my files

WordCount.java

package counter;


public class WordCount extends Configured implements Tool {

public int run(String[] arg0) throws Exception {
    Configuration conf = new Configuration();

    Job job = new Job(conf, "wordcount");

    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);

    job.setMapperClass(WordCountMapper.class);
    job.setReducerClass(WordCountReducer.class);

    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(TextOutputFormat.class);

    FileInputFormat.addInputPath(job, new Path("counterinput"));
    // Erase previous run output (if any)
    FileSystem.get(conf).delete(new Path("counteroutput"), true);
    FileOutputFormat.setOutputPath(job, new Path("counteroutput"));

    job.waitForCompletion(true);
    return 0;
}   

public static void main(String[] args) throws Exception {
    int res = ToolRunner.run(new Configuration(), new WordCount(), args);
    System.exit(res);

    }
}

WordCountMapper.java

public class WordCountMapper extends
Mapper<LongWritable, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private Text word = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text,IntWritable> output, Reporter reporter)
    throws IOException, InterruptedException {
        System.out.println("hi");
    String line = value.toString();
    StringTokenizer tokenizer = new StringTokenizer(line);
    while (tokenizer.hasMoreTokens()) {
        word.set(tokenizer.nextToken());
        output.collect(word, one);
        }
    }
}

WordCountReducer.java

public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    public void reduce(Text key, Iterator<IntWritable> values,
        OutputCollector<Text,IntWritable> output, Reporter reporter) throws IOException, InterruptedException {
        System.out.println("hello");
        int sum = 0;
        while (values.hasNext()) {
            sum += values.next().get();
        }
        output.collect(key, new IntWritable(sum));
    }
}

I am getting following error

13/06/23 23:13:25 INFO jvm.JvmMetrics: Initializing JVM Metrics with  
processName=JobTracker, sessionId=

13/06/23 23:13:25 WARN mapred.JobClient: Use GenericOptionsParser for parsing the 
arguments. Applications should implement Tool for the same.
13/06/23 23:13:26 INFO input.FileInputFormat: Total input paths to process : 1
13/06/23 23:13:26 INFO mapred.JobClient: Running job: job_local_0001
13/06/23 23:13:26 INFO input.FileInputFormat: Total input paths to process : 1
13/06/23 23:13:26 INFO mapred.MapTask: io.sort.mb = 100
13/06/23 23:13:26 INFO mapred.MapTask: data buffer = 79691776/99614720
13/06/23 23:13:26 INFO mapred.MapTask: record buffer = 262144/327680
13/06/23 23:13:26 WARN mapred.LocalJobRunner: job_local_0001
java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, 
recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:845)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:541)
at org.
apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.hadoop.mapreduce.Mapper.map(Mapper.java:124)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
13/06/23 23:13:27 INFO mapred.JobClient:  map 0% reduce 0%
13/06/23 23:13:27 INFO mapred.JobClient: Job complete: job_local_0001
13/06/23 23:13:27 INFO mapred.JobClient: Counters: 0

I think it is not able to find Mapper and reducer class. I have written the code in main class, It is getting default Mapper and reducer class.

Tariq · Accepted Answer

Add these 2 lines in your code :

job.setMapOutputKeyClass(Text.class);
job.setMapOutputValueClass(IntWritable.class);

You are using TextOutputFormat which emits LongWritable key and Text value by default, but you are emitting Text as key and IntWritable as value. You need to tell this to the famework.

HTH

Jerry Ragland · Answer

This may not be your issue but I had this silly issue once. Make sure you are not mixing the old and the new libraries i.e. mapred vs mapreduce. Annotate @Overide on your map and reduce methods. If you see errors you are not properly overriding the methods.

Suresh Vadali · Answer

I got a similar exception stack trace due to improper Mapper Class set in my code (typo :) )

job.setMapperClass(Mapper.class)  // Set to org.apache.hadoop.mapreduce.Mapper due to type

Notice that mistakenly I was using Mapper class from mapreduce package, I changed it to my custom mapper class:

job.setMapperClass(LogProcMapperClass.class) // LogProcMapperClass is my custom mapper.

The exception is resolved after I corrected the mapper class.

ketankk · Answer

Removing this from Code solved the issue

super.map(key, value, context);

Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

Tags:

java

hadoop

mapreduce

Neil

4 Answers

Tariq

Jerry Ragland

Suresh Vadali

ketankk

Recent Activity

Donate For Us

Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

Tags:

java

hadoop

mapreduce

Neil

4 Answers

Tariq

Jerry Ragland

Suresh Vadali

ketankk

Related questions

Recent Activity

Donate For Us