
Hadoop MapReduce intermediate output

Is there a way to log the intermediate (map phase) output of a MapReduce job without editing the application? (The application is not mine, but the cluster is, and I can set up the Hadoop cluster however I want.)

alessiop86 asked Oct 23 '11 16:10



1 Answer

The keep.task.files.pattern parameter can be used to keep the intermediate files. It takes a regular expression, and files are kept for tasks whose names match it. The intermediate files have to be cleaned up manually once the job has completed. Since this is a map/reduce task property, it has to be set in the job's configuration file and the jar file packaged again.
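As a rough sketch of what that property entry might look like, assuming a Hadoop 1.x style configuration file such as mapred-site.xml or a job-specific XML file. The regular expression here is only an illustration: it assumes task attempt IDs of the form attempt_..._m_000000_0, where _m_ marks a map task, so adjust it to the IDs your cluster actually produces. On newer releases the same setting is listed under the name mapreduce.task.files.preserve.filepattern.

```xml
<!-- Illustrative fragment only; goes inside the <configuration> element. -->
<property>
  <name>keep.task.files.pattern</name>
  <!-- Assumed pattern: keep files for map task attempts only (IDs containing _m_). -->
  <value>.*_m_.*</value>
</property>
```

The kept files stay on the local disks of the nodes that ran the matching tasks, so they need to be inspected and removed there by hand afterwards.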

Praveen Sripati answered Oct 18 '22 12:10
