While running a Hadoop job in pseudo-distributed (single-node) mode, the task fails and gets killed. Error: Task attempt_ failed to report status for 601 seconds.
But the same program runs fine through Eclipse (as a local job).
Task: there are around 25K keywords, and the output will be all possible combinations (two at a time), i.e. around 25K * 25K entries.
What could be the issue?
For some reason the task, when executed on your pseudo-distributed node, is not reporting progress. You can increase the mapred.task.timeout setting in mapred-site.xml. Its default value, as defined in mapred-default.xml, is:
<property>
  <name>mapred.task.timeout</name>
  <value>600000</value>
  <description>The number of milliseconds before a task will be
  terminated if it neither reads an input, writes
  an output, nor updates its status string.
  </description>
</property>
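If you only want to raise the timeout for a single job rather than cluster-wide, you can set it on the job's Configuration instead. A minimal sketch; the 20-minute value, class name, and job name are just placeholders, and newer Hadoop versions use the key mapreduce.task.timeout with mapred.task.timeout kept as a deprecated alias:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class TimeoutJobDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Raise the task timeout to 20 minutes (the value is in milliseconds).
        conf.setLong("mapred.task.timeout", 1200000L);

        Job job = Job.getInstance(conf, "keyword-pair-generation");
        // ... set mapper, reducer, input/output paths as usual ...
    }
}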
Increasing the timeout can be acceptable for testing, one-off jobs, or debugging, but as practice shows it is not a good solution for production; you should instead review and optimize the code so that it reports progress regularly.
Hadoop provides a reporting API for exactly this purpose. If a task does not report progress for 10 minutes (600 seconds), the framework considers it stuck and kills it. See the progress-reporting API (Reporter in the old mapred API, TaskAttemptContext.progress() in the new mapreduce API) in the Hadoop documentation.
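A minimal sketch of how progress reporting looks with the new mapreduce API; the class name, the keyword list assumed to be loaded in setup(), and the 10,000-record interval are illustrative assumptions, not from the original post:

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class PairMapper extends Mapper<Object, Text, Text, Text> {

    // Assumed to be filled in setup(), e.g. from a side file or the
    // distributed cache (hypothetical detail for this sketch).
    private List<String> allKeywords = new ArrayList<String>();

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        String keyword = value.toString().trim();
        long emitted = 0;
        // Pair this keyword with every other keyword, as in the question.
        for (String other : allKeywords) {
            context.write(new Text(keyword), new Text(other));
            if (++emitted % 10000 == 0) {
                // Tell the framework the task is alive; this resets the
                // mapred.task.timeout timer before the 600-second limit.
                context.progress();
                context.setStatus("emitted " + emitted + " pairs for " + keyword);
            }
        }
    }
}

Note that writing output also counts as progress; explicit progress() and setStatus() calls matter most during long stretches of computation that produce no reads or writes.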