I came across the MultithreadedMapper class in the new Hadoop version, and the documentation says it can be used in place of the conventional (single-threaded) mapper class. However, I could not find any example that demonstrates how to use it. I would also like to use the setNumberOfThreads() method. Is there a code example for this?
Thanks in advance
Here is a small code snippet:
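import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.map.MultithreadedMapper;

Configuration conf = new Configuration();
Job job = new Job(conf);
// Hadoop runs MultithreadedMapper, which delegates to the real mapper.
job.setMapperClass(MultithreadedMapper.class);
// Job copies the Configuration it is given, so set properties on the job's own copy.
// getName() (unlike getCanonicalName()) stays loadable if the mapper is a nested class.
job.getConfiguration().set("mapred.map.multithreadedrunner.class", WebGraphMapper.class.getName());
job.getConfiguration().set("mapred.map.multithreadedrunner.threads", "8");
job.setJarByClass(WebGraphMapper.class);
// rest omitted
job.waitForCompletion(true);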
I think it is fairly self-explanatory. You set MultithreadedMapper as the job's mapper class and then configure which class (your real mapper) it should run, and with how many threads. There are also static convenience methods that do this configuration for you. A call could look like this:
MultithreadedMapper.setMapperClass(job, WebGraphMapper.class);
MultithreadedMapper.setNumberOfThreads(job, 8);
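To make the delegation idea more concrete, here is a rough conceptual sketch in plain Java. It is not Hadoop's implementation and uses no Hadoop classes; the names MultithreadedMapSketch and runMap are made up for illustration. It only shows the general pattern: several worker threads take records from one shared input and pass each record to the "real" map function.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Conceptual sketch only: hypothetical class, no Hadoop types involved.
class MultithreadedMapSketch {

    // Applies `delegate` (standing in for the real mapper) to every record,
    // using `threads` worker threads that share a single input iterator.
    static <I, O> List<O> runMap(List<I> records, Function<I, O> delegate, int threads)
            throws InterruptedException {
        Iterator<I> input = records.iterator();
        List<O> output = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                while (true) {
                    I record;
                    // Reading the next record is serialized across threads.
                    synchronized (input) {
                        if (!input.hasNext()) {
                            return;
                        }
                        record = input.next();
                    }
                    // The map call itself runs concurrently.
                    output.add(delegate.apply(record));
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> lines = Arrays.asList("a", "bb", "ccc", "dddd");
        List<Integer> lengths = runMap(lines, String::length, 8);
        Collections.sort(lengths); // output order depends on thread scheduling
        System.out.println(lengths); // prints [1, 2, 3, 4]
    }
}
```

In this sketch the same delegate is shared by all threads, so it has to be safe to call concurrently, and the results arrive in no particular order. Extra threads only pay off when the per-record work dominates the cost of reading the input.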