 

What are Hadoop best practices for handling exceptions in Mapper or Reducer?

I want to understand best practices for handling exceptions in a Mapper or Reducer.

Option 1: Don't use any try/catch; let the task fail, and MapReduce will retry it, eventually failing the job once the retries are exhausted. The properties mapreduce.map.maxattempts and mapreduce.reduce.maxattempts control the retry limit here.
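For Option 1, the retry caps can be set cluster-wide in mapred-site.xml (or per job on the job's Configuration). A sketch of the config fragment, with the values shown being Hadoop's defaults:

```xml
<!-- mapred-site.xml: per-task retry limits before the job is failed -->
<property>
  <name>mapreduce.map.maxattempts</name>
  <value>4</value>
</property>
<property>
  <name>mapreduce.reduce.maxattempts</name>
  <value>4</value>
</property>
```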

Option 2: Use counters to record the number of failures in the catch block, and, based on some threshold for these errors, either kill the job or simply use the counters to report the number of failed records.
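A minimal, self-contained sketch of the Option 2 pattern in plain Java (no Hadoop dependency, so the names and the threshold are illustrative): the map loop swallows per-record failures and counts them, and the driver checks the failure rate afterwards. In a real job the catch block would call `context.getCounter(...).increment(1)` inside the Mapper, and the driver would read the value back via `job.getCounters()` after completion.

```java
import java.util.List;

public class BadRecordCounterSketch {
    // Hypothetical threshold: fail the job if more than 5% of records are bad.
    static final double MAX_BAD_FRACTION = 0.05;

    public static void main(String[] args) {
        List<String> records = List.of("1", "2", "oops", "4");
        long bad = 0;
        long total = 0;
        for (String rec : records) {
            total++;
            try {
                Integer.parseInt(rec);   // stand-in for the real map logic
            } catch (NumberFormatException e) {
                bad++;                   // count the bad record instead of failing the task
            }
        }
        // Driver-side check: kill the job only if failures exceed the threshold.
        double fraction = (double) bad / total;
        System.out.println("bad=" + bad + " fraction=" + fraction);
        System.out.println(fraction > MAX_BAD_FRACTION
                ? "FAIL: too many bad records"
                : "OK");
    }
}
```

The advantage over Option 1 is that one malformed record cannot take down a multi-hour job; the trade-off is that you must choose a threshold consciously, since silently dropping records hides data-quality problems.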

Are there any other common/standard practices for handling exceptions in MapReduce?

asked Oct 31 '22 by SurjanSRawat
1 Answer

Options 1 and 2 as listed are some of the ways we handle exceptions in our project. Please have a look here; it lists a few more options.

answered Nov 15 '22 by Rakesh Shah