Are setup and cleanup methods called in each mapper and reducer tasks respectively? Or are they called only once at the start of overall mapper and reducer jobs?
They are called for each task, so if you have 20 mappers running, the setup / cleanup will be called for each one.
One gotcha is the standard run method for both Mapper and Reducer does not catch exceptions around the map / reduce methods - so if an exception is thrown in these methods, the clean up method will not be called.
2020 Edit: As noted in the comments, this statement from 2012 (Hadoop 0.20) is no longer true, the cleanup is called as part of a finally block.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With