I am looking for the jar files needed to run the Hadoop jobs in the examples and test jars. They used to be under /usr/lib/hadoop, but apparently no longer are. Pointers appreciated.
Note: this question was originally about CDH4.2, but some answers include information for later versions.
The samples are located on the HDInsight cluster at /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar. Source code for these samples is included on the HDInsight cluster at /usr/hdp/current/hadoop-client/src/hadoop-mapreduce-project/hadoop-mapreduce-examples. The wordcount sample, for instance, counts the words in the input files.
For this you need to add a package name to your .java file that matches the directory structure, for example home.hduser.dir, and when running the hadoop jar command, specify the class name with the package structure, for example home.
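As a rough illustration of the package rule, the sketch below turns a source path (relative to the source root) into the fully qualified class name you would pass to hadoop jar. The helper fqcn_from_path, the class name WordCount, and the jar name myjob.jar are all hypothetical, chosen only for this example.

```shell
# Hypothetical helper: map a source path, relative to the source root,
# to the fully qualified class name expected on the command line.
fqcn_from_path() {
  # Drop the .java suffix, then turn directory separators into dots.
  printf '%s\n' "$1" | sed -e 's/\.java$//' -e 's|/|.|g'
}

# Example (WordCount.java and myjob.jar are placeholder names):
#   fqcn_from_path home/hduser/dir/WordCount.java
#   hadoop jar myjob.jar home.hduser.dir.WordCount <input> <output>
```

The file itself would then need to start with a matching declaration such as `package home.hduser.dir;`.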
The most common MapReduce example counts the number of times each word occurs in a corpus. Suppose you had a copy of the internet (I've been fortunate enough to have worked in such a situation), and you wanted a list of every word on the internet along with how many times it occurred.
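To make the idea concrete, here is a local shell analogy of word count. It is not Hadoop code and not how the bundled example is implemented; it only mirrors the three phases (map, shuffle, reduce) with standard Unix tools. The function name word_count is made up for this sketch.

```shell
# Local analogy of MapReduce word count using standard Unix tools.
word_count() {
  tr -cs '[:alnum:]' '\n' |            # "map": emit one word per line
    tr '[:upper:]' '[:lower:]' |       # normalize case so "The" and "the" match
    sort |                             # "shuffle": group identical words together
    uniq -c |                          # "reduce": count each group
    awk 'NF == 2 { print $2 "\t" $1 }' # print word<TAB>count, skipping blank lines
}
```

For example, `printf 'to be or not to be\n' | word_count` prints be 2, not 1, or 1, to 2, one tab-separated pair per line. On a real cluster the same grouping and counting is spread across many machines.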
find / -name 'hadoop-mapreduce-examples*.jar'
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.0.0-cdh4.7.0.jar
/usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar
On my single-node Hadoop 2.3.0-cdh5.0.2 setup on CentOS release 6.5 (Final), I found the MapReduce examples at /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-2.3.0-cdh5.0.3.jar (symlinked from /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar). Via http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/cdh5ig_tips_guidelines.html.
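Once the jar has been located, a small wrapper along these lines can find it and build the command for the bundled wordcount job. This is a sketch only: the function names find_examples_jar and wordcount_command are made up here, the input and output paths are placeholders, and the command is printed rather than executed so you can inspect it first.

```shell
# Hypothetical helper: print the first examples jar found under the given directories.
find_examples_jar() {
  # The pattern is quoted so the shell hands it to find unexpanded.
  find "$@" -name 'hadoop-mapreduce-examples*.jar' 2>/dev/null | sort | head -n 1
}

# Hypothetical helper: print (not run) the command for the bundled wordcount job.
# $1 = jar path, $2 = input directory, $3 = output directory (placeholders).
wordcount_command() {
  printf 'hadoop jar %s wordcount %s %s\n' "$1" "$2" "$3"
}

# Example:
#   examples_jar=$(find_examples_jar /usr/lib/hadoop-mapreduce)
#   wordcount_command "$examples_jar" input output
```

Running hadoop jar on the examples jar with no program name prints the list of available example programs.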