I'm attempting to do something simple in Hadoop, and I've noticed that mappers and reducers are defined as static inner classes everywhere. My task is going to be decomposed into several map phases and one final reduce. What if I'd like to reuse one of my mappers in another job? If my mapper class is defined as a static inner class, can I use it in another job? Also, non-trivial problems may require many more (and more complicated) mappers, so putting them all into one giant file becomes painful to maintain. Is there any way to have mappers and reducers as regular classes (possibly even in a separate JAR), separate from the job itself?
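To make the setup concrete, here is roughly the pattern I mean (a rough sketch; class names are placeholders) — the mapper nested inside the driver class as a static class, which is what I'd like to move away from so other jobs can reuse it:

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;

    public class MyJob {

        // Mapper nested inside the driver as a static class, as in most examples.
        public static class MyMapper
                extends Mapper<LongWritable, Text, Text, IntWritable> {

            @Override
            protected void map(LongWritable key, Text value, Context context)
                    throws IOException, InterruptedException {
                context.write(value, new IntWritable(1));
            }
        }

        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance();
            job.setJarByClass(MyJob.class);
            job.setMapperClass(MyMapper.class);
            // ... reducer, input/output formats, paths, etc.
        }
    }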
When mapper and reducer classes are declared as inner classes of another class, they have to be declared static so that they do not depend on an instance of the enclosing class.
As for making things static in general: in Java it's good practice to mark a pure function as static even if it is private. That makes it explicit that the method is independent of any instance, even though it doesn't guarantee that the method is actually pure.
A Hadoop Java program consists of a Mapper class and a Reducer class along with a driver class. The Hadoop Mapper is the task that processes every input record from a file and generates output that in turn serves as the input to the Reducer; it produces that output by emitting new key-value pairs.
In the classic word-count example, the four type parameters of the mapper are LongWritable, Text, Text and IntWritable: the first two describe the input key/value pair and the last two describe the intermediate output pair. The four type parameters of the reducer are then Text, IntWritable, Text and IntWritable.
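For illustration, a word-count-style sketch (class names invented here) of how those type parameters line up — the mapper's output pair types are exactly the reducer's input pair types:

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;

    // Input: byte offset of the line (LongWritable) + the line itself (Text).
    // Intermediate output: word (Text) + count of one (IntWritable).
    class TokenMapper extends Mapper<LongWritable, Text, Text, IntWritable> { }

    // Input: word (Text) + all of its counts (IntWritable).
    // Final output: word (Text) + summed count (IntWritable).
    class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> { }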
Is your question whether the class has to be static, may be static, may be inner, or should be inner?
Hadoop itself needs to be able to instantiate your Mapper or Reducer by reflection, given the class reference/name configured in your Job. This will fail if it is a non-static inner class, since an instance can then be created only in the context of an instance of some other of your classes, which Hadoop presumably knows nothing about. (Unless the inner class extends its enclosing class, I suppose.)
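A minimal sketch of that failure mode, using plain Java reflection (Hadoop goes through its own ReflectionUtils helper, but the effect is the same; the class names here are made up):

    public class Outer {

        // Static nested class: gets an ordinary no-arg constructor.
        public static class StaticMapper { }

        // Non-static inner class: its only constructor implicitly takes an Outer instance.
        public class InnerMapper { }

        public static void main(String[] args) throws Exception {
            // Roughly what a framework does when all it has is a class name.
            Object ok = StaticMapper.class.getDeclaredConstructor().newInstance();   // works

            // Throws NoSuchMethodException: there is no no-arg constructor,
            // only InnerMapper(Outer), and the framework has no Outer instance.
            Object fails = InnerMapper.class.getDeclaredConstructor().newInstance();
        }
    }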
So to answer the first question: it should not be a non-static inner class, since that almost surely makes it unusable. To answer the second and third: yes, it can be a static (inner) class.
To me a Mapper or Reducer is plainly a top-level concept and deserves a top-level class. Some like to make them static inner classes to pair them with a "Runner" class. I don't like this, as that is really what subpackages are for. You note another design reason to avoid it. To the fourth question: no, I don't believe inner classes are good practice here.
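For example (package and class names are only illustrative), each mapper can live in its own file under a subpackage, and any job can wire it in with job.setMapperClass(TokenCountMapper.class):

    // File: com/example/analytics/mapreduce/TokenCountMapper.java
    package com.example.analytics.mapreduce;

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    // A top-level mapper, reusable by any job that needs word tokenization.
    public class TokenCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }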
Final question: yes, the Mapper and Reducer classes can be in a separate JAR file. You tell Hadoop which JAR file contains all of this code, and that's the one it will ship off to the workers. The workers don't need your Job class; however, they do need everything the Mapper and Reducer depend on, in that same JAR.
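A sketch of the driver side under those assumptions (class and path names are invented; TokenCountMapper and TokenSumReducer stand in for your own classes living in another JAR):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    import com.example.analytics.mapreduce.TokenCountMapper;
    import com.example.analytics.mapreduce.TokenSumReducer;

    public class TokenCountDriver {
        public static void main(String[] args) throws Exception {
            Job job = Job.getInstance(new Configuration(), "token count");

            // Point Hadoop at the JAR that actually contains the mapper/reducer
            // (and their dependencies); that JAR is what gets shipped to the workers.
            job.setJarByClass(TokenCountMapper.class);

            job.setMapperClass(TokenCountMapper.class);
            job.setReducerClass(TokenSumReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }

If the mapper/reducer code lives in yet another JAR, the standard -libjars generic option (available when the driver runs through ToolRunner/GenericOptionsParser) is one way to ship it alongside the job.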
I feel the answer above is precise and covers the rationale well. Except that, in my opinion, inner classes should still be used when creating the map and reduce steps; IMO, all the code should be in one place. And generics can be used thoughtfully in that single class to ensure there are no typecasting errors.
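As a rough sketch of that style (names invented), the reducer sits next to the driver as a static nested class with its generics spelled out, so the key/value types are checked at compile time instead of being cast inside reduce():

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class WordCountJob {

        // Generics on Reducer mean the reduce() signature and context.write()
        // are type-checked by the compiler; no manual casting of keys or values.
        public static class SumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

            private final IntWritable total = new IntWritable();

            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                total.set(sum);
                context.write(key, total);
            }
        }
    }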