I am trying to limit the number of lines each of the Mappers gets. My code goes like this:
package com.iathao.mapreduce;

import java.io.IOException;
import java.net.MalformedURLException;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.lib.NLineInputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.regexp.RESyntaxException;
import com.gargoylesoftware.htmlunit.FailingHttpStatusCodeException;

public class Main {
    public static void main(String[] args) throws FailingHttpStatusCodeException, MalformedURLException, IOException, RESyntaxException {
        try {
            if (args.length != 2) {
                System.err.println("Usage: NewMaxTemperature <input path> <output path>");
                System.exit(-1);
            }
            Job job = new Job();
            job.setJarByClass(Main.class);
            job.getConfiguration().set("mapred.max.map.failures.percent", "100");
            // job.getConfiguration().set("mapred.map.max.attempts", "10");
            //NLineInputFormat. .setNumLinesPerSplit(job, 1);
            job.setInputFormatClass(NLineInputFormat.class);
At the last line of the sample (job.setInputFormatClass(NLineInputFormat.class);) I get the following error:
The method setInputFormatClass(Class<? extends InputFormat>) in the type Job is not applicable for the arguments (Class<NLineInputFormat>)
Did I somehow get the wrong NLineInputFormat class?
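You are mixing the old and the new API. NLineInputFormat is imported from the old org.apache.hadoop.mapred package, while Job comes from the new org.apache.hadoop.mapreduce package: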
import org.apache.hadoop.mapred.lib.NLineInputFormat;
import org.apache.hadoop.mapreduce.Job;
According to "Hadoop: The Definitive Guide":
The new API is in the org.apache.hadoop.mapreduce package (and subpackages). The old API can still be found in org.apache.hadoop.mapred.
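If you plan to use the new API, use the new-API NLineInputFormat, org.apache.hadoop.mapreduce.lib.input.NLineInputFormat. It extends the org.apache.hadoop.mapreduce.InputFormat type that Job.setInputFormatClass expects, whereas the old class implements the unrelated org.apache.hadoop.mapred.InputFormat interface, which is why the compiler rejects it.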
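As a rough illustration of the fix, here is a minimal driver sketch that uses only new-API classes. It assumes Hadoop 2.x or later on the classpath (for Job.getInstance and the new-API NLineInputFormat). The class name NLineDriver and the helper buildJob are hypothetical names chosen for this example, and the mapper, reducer and output types from the original job are left out.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
// New-API class: note "mapreduce.lib.input", not "mapred.lib".
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Hypothetical driver class, for illustration only.
public class NLineDriver {

    // Builds a job whose input is split so that each mapper
    // receives at most linesPerSplit lines of the input file.
    static Job buildJob(Configuration conf, String in, String out, int linesPerSplit)
            throws IOException {
        Job job = Job.getInstance(conf);
        job.setJarByClass(NLineDriver.class);

        // Compiles because this NLineInputFormat extends the new-API InputFormat.
        job.setInputFormatClass(NLineInputFormat.class);
        NLineInputFormat.setNumLinesPerSplit(job, linesPerSplit);

        FileInputFormat.addInputPath(job, new Path(in));
        FileOutputFormat.setOutputPath(job, new Path(out));
        // Mapper, reducer and output key/value classes would be set here.
        return job;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("Usage: NLineDriver <input path> <output path>");
            System.exit(-1);
        }
        Job job = buildJob(new Configuration(), args[0], args[1], 1);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

With the import changed, the commented-out setNumLinesPerSplit(job, 1) call from the question should also compile, since the new-API class provides it as a static method taking a Job.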