Debugging Hadoop applications

1 Answer

The page @SquareCog points to is a very good source of information on debugging a MapReduce job once you are running it on a cloud.

Before you reach that point, though, you should consider writing unit tests for your mappers and reducers, so you can verify that the basic logic works before the job ever runs on a cluster. If you want to test-drive your map and reduce logic, check out MRUnit, which works in a similar fashion to JUnit.
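As a rough illustration of the idea, here is a minimal sketch of map and reduce logic pulled out into plain methods and checked locally. It deliberately uses no Hadoop or MRUnit classes so that it runs on its own. The word-count task, the class name WordCountLogic, and the method names map and reduce are all hypothetical and chosen only for this example. In a real job the same logic would sit inside your Mapper and Reducer classes, and MRUnit would drive those classes directly.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical example: word-count logic kept free of Hadoop types so it
// can be tested without a cluster. This is not MRUnit's API.
final class WordCountLogic {

    // Map step: emit a (word, 1) pair for each whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        for (String token : line.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(new AbstractMap.SimpleEntry<>(token.toLowerCase(), 1));
            }
        }
        return out;
    }

    // Reduce step: sum all counts emitted for a single word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Tiny hand-rolled check so the sketch needs no test framework.
    private static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> pairs = map("Hadoop hadoop  MapReduce");
        check(pairs.size() == 3, "map should emit one pair per token");
        check(pairs.get(0).getKey().equals("hadoop"), "tokens should be lower-cased");
        check(map("   ").isEmpty(), "blank lines should emit nothing");
        check(reduce("hadoop", Arrays.asList(1, 1)) == 2, "reduce should sum counts");
        System.out.println("all checks passed");
    }
}
```

Keeping the logic in small methods like these means most bugs show up on your own machine, leaving cluster-side debugging for problems that only appear at scale.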

199

answered Oct 06 '22 23:10

Binary Nerd


Tags:

hadoop

mapreduce

Deepak
