How to find the JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

Tags: java, python, amazon-web-services, hadoop, emr

I'm following a Pluralsight video tutorial about Amazon EMR. I am stuck because I get this error:

Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

Please note that the tutorial is old and uses an older EMR version. I am using the latest version. Is that a problem?

These are the steps I took after entering the credentials in PuTTY:

1) Hadoop

2) mkdir streamingCode

3) wget -o ./streamingCode/wordSplitter.py s3://elasticmapreduce/samples/wordcount/wordSplitter.py

4) hadoop jar contrib/streaming/hadoop-streaming.jar -files streamingCode/wordSplitter.py -mapper wordSplitter.py input s3://elasticmapreduce/samples/wordcount/input -output streamingCode/wordCountOut -reducer aggregate

I cannot execute step 4 because I get the error below:

Not a valid JAR: /home/hadoop/contrib/streaming/hadoop-streaming.jar

asked Sep 12 '15 21:09

harshil bhatt

2 Answers

The Hadoop streaming jar is still available in the latest release of EMR Hadoop. Starting with EMR release 4.0.0 it can be found at /usr/lib/hadoop-mapreduce/hadoop-streaming.jar.

For other differences between EMR release versions, see http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-release-differences.html.
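
As a rough sketch, step 4 from the question could be rerun against the path given above. The mapper, input location, and output directory are copied from the question; the `-input` spelling (the question shows `input` without a dash), the `STREAMING_JAR` variable name, and the existence check are assumptions added for illustration, so adjust them to your cluster.

```shell
# Path reported above for EMR release 4.0.0 and later.
STREAMING_JAR=/usr/lib/hadoop-mapreduce/hadoop-streaming.jar

if [ -f "$STREAMING_JAR" ]; then
  # Generic options such as -files go before the streaming options.
  # "-input" is assumed here; the question shows "input" without the dash.
  hadoop jar "$STREAMING_JAR" \
    -files streamingCode/wordSplitter.py \
    -mapper wordSplitter.py \
    -reducer aggregate \
    -input s3://elasticmapreduce/samples/wordcount/input \
    -output streamingCode/wordCountOut
else
  # Jar not at the expected path, so do not attempt the job.
  echo "No streaming jar at $STREAMING_JAR" >&2
fi
```

If the jar is missing at that path, the script only prints a message instead of failing with "Not a valid JAR".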

answered Sep 21 '22 12:09

ChristopherB

For the HADOOP_STREAMING variable, the correct path depends on which HDP (Hortonworks Data Platform) version you are using.

Search for the jar's location with the command: find / -name 'hadoop-streaming*.jar'

Src: http://thecoatlessprofessor.com/programming/installing-r-studio-server-on-hortonworks-virtual-box-image-and-rmr2-a-k-a-rhadoop-r-package/
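
A minimal sketch of how the search result might be stored in HADOOP_STREAMING. The helper name `find_streaming_jar`, taking only the first match, and starting the search at /usr/lib rather than / are all assumptions made to keep the example quick; pass / as the argument to search the whole filesystem as in the answer.

```shell
# Print the first hadoop-streaming jar found under the given directory.
# Permission-denied messages from find are discarded.
# Using only the first match is an assumption; several versions may be installed.
find_streaming_jar() {
  find "${1:-/}" -name 'hadoop-streaming*.jar' 2>/dev/null | head -n 1
}

# Search root /usr/lib is an assumption; use / for a full (slower) search.
HADOOP_STREAMING=$(find_streaming_jar /usr/lib)
export HADOOP_STREAMING

echo "HADOOP_STREAMING=${HADOOP_STREAMING:-<not found>}"
```

If more than one jar is installed, it may be worth listing all matches and choosing the one that belongs to the Hadoop version actually in use.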

answered Sep 18 '22 12:09

Nikhil B Agarwal
