What is the keyword Context in the Hadoop programming world?

Tags:

hadoop

mapreduce

What exactly is the keyword Context in the Hadoop MapReduce world, in terms of the new API?

It's used extensively to write output pairs from Map and Reduce tasks, but I am not sure whether it can be used anywhere else, or what exactly happens whenever I use context. Is it an Iterator with a different name?

What is the relationship between the classes Mapper.Context, Reducer.Context and JobContext?

Can someone please explain this, starting in layman's terms and then going into detail? I was not able to understand much from the Hadoop API documentation.

Thanks for your time and help.

asked Nov 16 '14 05:11

Brijesh

2 Answers

The Context object allows the Mapper/Reducer to interact with the rest of the Hadoop system. It includes configuration data for the job as well as interfaces that allow the task to emit output.

Applications can use the Context:

to report progress
to set application-level status messages
to update Counters
to indicate that they are alive
to get the values stored in the job configuration, in both the map and reduce phases.
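
To make the list above concrete, here is a minimal sketch in plain Java. It does not use Hadoop at all: `ToyContext`, `ToyWordCountMapper`, `ContextSketch` and the `toy.delimiter` key are hypothetical names invented for illustration. In real Hadoop code the corresponding calls on the Context are `context.write(...)`, `context.getCounter(...)`, `context.setStatus(...)`, `context.progress()` and `context.getConfiguration()`. The point is only that a Context is a handle back to the framework, not an Iterator.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Driver: plays the role of the framework that builds a context and calls map().
final class ContextSketch {
    static ToyContext run(String line) {
        Map<String, String> conf = new HashMap<>();
        conf.put("toy.delimiter", " "); // hypothetical configuration key
        ToyContext context = new ToyContext(conf);
        ToyWordCountMapper.map(line, context);
        return context;
    }

    public static void main(String[] args) {
        ToyContext context = run("to be  or not to be");
        System.out.println("output:   " + context.getOutput());
        System.out.println("counter:  " + context.getCounter("EMPTY_TOKENS"));
        System.out.println("status:   " + context.getStatus());
    }
}

// Toy stand-in for a Context. NOT Hadoop's class; it only mimics the roles listed above.
final class ToyContext {
    private final Map<String, String> configuration;
    private final List<Map.Entry<String, Integer>> output = new ArrayList<>();
    private final Map<String, Long> counters = new HashMap<>();
    private String status = "";
    private int progressCalls = 0;

    ToyContext(Map<String, String> configuration) {
        this.configuration = configuration;
    }

    // Read a value from the job configuration.
    String getConfigValue(String key, String defaultValue) {
        return configuration.getOrDefault(key, defaultValue);
    }

    // Emit one output pair.
    void write(String key, int value) {
        output.add(new AbstractMap.SimpleEntry<String, Integer>(key, value));
    }

    // Update a counter.
    void incrementCounter(String name, long amount) {
        counters.merge(name, amount, Long::sum);
    }

    // Set an application-level status message.
    void setStatus(String message) {
        status = message;
    }

    // Signal that the task is still alive.
    void progress() {
        progressCalls++;
    }

    List<Map.Entry<String, Integer>> getOutput() { return output; }
    long getCounter(String name) { return counters.getOrDefault(name, 0L); }
    String getStatus() { return status; }
    int getProgressCalls() { return progressCalls; }
}

// User code: everything it needs from the "system" comes through the context.
final class ToyWordCountMapper {
    static void map(String line, ToyContext context) {
        String delimiter = context.getConfigValue("toy.delimiter", " ");
        for (String token : line.split(Pattern.quote(delimiter))) {
            if (token.isEmpty()) {
                context.incrementCounter("EMPTY_TOKENS", 1);
                continue;
            }
            context.write(token, 1);
            context.progress();
        }
        context.setStatus("processed 1 line");
    }
}
```

Notice that the mapper never iterates over the context. It only pushes output and bookkeeping information into it and pulls configuration out of it.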

answered Oct 21 '22 01:10

SMA

The new API makes extensive use of Context objects, which allow user code to communicate with the MapReduce system.

A Context unifies the roles of JobConf, OutputCollector, and Reporter from the old API.
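
As a rough illustration of that unification, the sketch below models the three old roles as small interfaces and one object that plays all of them. It is plain Java, not Hadoop: every `Toy...` type, `OldStyleMapper`, `NewStyleMapper` and the `toy.mode` key are hypothetical. The shape mirrors the old API, where `configure(JobConf)` and `map(key, value, OutputCollector, Reporter)` were separate entry points, against the new API's single `map(key, value, Context)`. In real new-API code the calls would be `context.write(...)`, `context.getCounter(...).increment(...)` and `context.getConfiguration()` rather than the old method names reused here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class UnifiedContextSketch {
    public static void main(String[] args) {
        System.out.println(runOldStyle("hadoop"));
        System.out.println(runNewStyle("hadoop"));
    }

    static List<String> runOldStyle(String value) {
        ToyUnifiedContext backing = new ToyUnifiedContext("upper");
        OldStyleMapper mapper = new OldStyleMapper();
        mapper.configure(backing);           // configuration arrives in a separate call
        mapper.map(value, backing, backing); // output and reporting are two more arguments
        return backing.records;
    }

    static List<String> runNewStyle(String value) {
        ToyUnifiedContext context = new ToyUnifiedContext("upper");
        new NewStyleMapper().map(value, context); // one object carries all three roles
        return context.records;
    }
}

// The three roles of the old API, reduced to toy interfaces.
interface ToyJobConf { String get(String name); }
interface ToyOutputCollector { void collect(String key, String value); }
interface ToyReporter {
    void setStatus(String status);
    void incrCounter(String name, long amount);
}

// A single object that implements all three roles, as a Context does conceptually.
final class ToyUnifiedContext implements ToyJobConf, ToyOutputCollector, ToyReporter {
    final Map<String, String> settings = new HashMap<>();
    final List<String> records = new ArrayList<>();
    final Map<String, Long> counters = new HashMap<>();
    String status = "";

    ToyUnifiedContext(String mode) { settings.put("toy.mode", mode); }

    public String get(String name) { return settings.get(name); }
    public void collect(String key, String value) { records.add(key + "=" + value); }
    public void setStatus(String newStatus) { status = newStatus; }
    public void incrCounter(String name, long amount) { counters.merge(name, amount, Long::sum); }
}

final class OldStyleMapper {
    private String mode;

    void configure(ToyJobConf conf) { mode = conf.get("toy.mode"); }

    void map(String value, ToyOutputCollector output, ToyReporter reporter) {
        output.collect(value, transform(mode, value));
        reporter.incrCounter("RECORDS", 1);
        reporter.setStatus("old-style map done");
    }

    static String transform(String mode, String value) {
        return "upper".equals(mode) ? value.toUpperCase(Locale.ROOT) : value;
    }
}

final class NewStyleMapper {
    void map(String value, ToyUnifiedContext context) {
        String mode = context.get("toy.mode");
        context.collect(value, OldStyleMapper.transform(mode, value));
        context.incrCounter("RECORDS", 1);
        context.setStatus("new-style map done");
    }
}
```

Both mappers emit the same record. The difference is only in how many separate objects the user code has to be handed to do so.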

answered Oct 21 '22 02:10

sras
