I want to understand the best practices for handling exceptions in a Mapper / Reducer.
Option 1: Have no try/catch at all and let the task fail. MR will retry the task, and once the attempts are exhausted the job eventually terminates. The properties mapreduce.map.maxattempts and mapreduce.reduce.maxattempts play a role here.
Option 2: Use counters to record the number of failures in the catch block. Then, based on some threshold for these errors, either kill the job or just use the counters to report the number of failed records.
Are there any (other) common/standard practices for handling exceptions in MapReduce?
Options 1 and 2 as listed are among the ways we handle this in our project. Please have a look here. It lists a few more options.
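To make Option 2 concrete, here is a minimal sketch of the count-and-threshold logic in plain Java, so it runs without a cluster. The class name `BadRecordTolerantParser`, the `maxBadRecords` threshold and the integer-parsing "work" are all made up for illustration. The `badRecords` field is only a stand-in for a Hadoop counter: in a real Mapper you would most likely call `context.getCounter(...).increment(1)` in the catch block instead, and throwing from `map()` past the threshold fails the task attempt, which brings you back to the retry behaviour of Option 1.

```java
import java.util.ArrayList;
import java.util.List;

public class BadRecordTolerantParser {
    // Stand-in for a Hadoop counter (assumption: a real Mapper would
    // increment a counter obtained from its context instead).
    private long badRecords = 0;

    // Hypothetical threshold: how many bad records we tolerate.
    private final long maxBadRecords;

    public BadRecordTolerantParser(long maxBadRecords) {
        this.maxBadRecords = maxBadRecords;
    }

    // Mirrors the body of map(): handle one record, count and skip failures.
    // Returns null for a skipped record.
    public Integer process(String record) {
        try {
            return Integer.parseInt(record.trim());
        } catch (NumberFormatException e) {
            badRecords++;
            if (badRecords > maxBadRecords) {
                // Past the threshold: stop swallowing errors and fail loudly.
                throw new IllegalStateException(
                        "Too many bad records: " + badRecords, e);
            }
            return null;
        }
    }

    public long getBadRecords() {
        return badRecords;
    }

    public static void main(String[] args) {
        BadRecordTolerantParser parser = new BadRecordTolerantParser(2);
        List<Integer> parsed = new ArrayList<>();
        for (String record : new String[] {"1", "oops", " 3 ", "n/a"}) {
            Integer value = parser.process(record);
            if (value != null) {
                parsed.add(value);
            }
        }
        // Two bad records were counted and skipped, none exceeded the threshold.
        System.out.println(parsed + " bad=" + parser.getBadRecords());
    }
}
```

Note that a per-task count like this only sees the records of one task. If the threshold should apply to the whole job, the driver would have to inspect the aggregated counters instead.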