
Running EMR example, getting 301 Error

I am trying to run the example hadoop-streaming command:

hadoop-streaming -files streamingCode/wordSplitter.py \
-mapper wordSplitter.py \
-input s3://elasticmapreduce/samples/wordcount/input \
-output streamingCode/wordCountOut \
-reducer aggregate

but I keep getting this error:

Exception in thread "main" com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Moved Permanently (Service: Amazon S3; Status Code: 301; Error Code: 301 Moved Permanently; Request ID: 98038E504E150CEC), S3 Extended Request ID: IW1x5otBSepAnPgW/RKELCUI9dhADQvrXqU2Ase1CLIa0SWDFnBbTscXihrvHvNm2ZRxjjSJZ1Q=

I think this is because my cluster is in us-west-2, but I can't figure out how to properly format the S3 URL (or perhaps that is not the issue at all).

Edit: After changing the input to the following URL:

s3://s3-us-west-2.amazonaws.com/elasticmapreduce/samples/wordcount/input

I now get the following error:

Exception in thread "main" com.amazon.ws.emr.hadoop.fs.shaded.com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: BC8DB415C780DF84), S3 Extended Request ID: sx8W/+gvND2ssqQce9ZQsZTiqxmSJYZs8OiXgrjwL3dm0JRPaC7ceHor+yrHsPuKTjM2LUwkRAw=

Edit: I have confirmed that the error is indeed caused by my cluster being in us-west-2: I created a cluster in us-east-1 and the same command works properly there. So the question is, how do I access an S3 bucket in another region? Is this even possible?

b_pcakes asked Nov 09 '22 11:11



1 Answer

Amazon changed the default behavior starting with emr-4.7.0, which caused this error for us when we upgraded EMR versions.

The solution is simple: add this configuration to core-site: fs.s3n.endpoint=s3.amazonaws.com
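
As a rough sketch of one way to apply this, assuming the cluster is created with the AWS CLI and the setting is passed through EMR's configuration JSON using the core-site classification. The file name configurations.json is my own choice, and the rest of the create-cluster arguments are left out because they depend on your setup:

```shell
# Write the core-site override to a local file.
# "configurations.json" is a hypothetical file name chosen for this example.
cat > configurations.json <<'EOF'
[
  {
    "Classification": "core-site",
    "Properties": {
      "fs.s3n.endpoint": "s3.amazonaws.com"
    }
  }
]
EOF

# Then reference the file when creating the cluster (left commented out:
# the remaining arguments are specific to your account and are not shown).
# aws emr create-cluster --release-label emr-4.7.0 \
#     --configurations file://./configurations.json ...
```

I have only seen this matter on emr-4.7.0 and later, so treat the release label above as an illustration rather than a requirement.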

harel answered Nov 14 '22 22:11
