
Can I use Hadoop with AWS4-HMAC-SHA256?

My newly created bucket uses AWS Signature Version 4. I'm trying to use it with Hadoop and getting the message:

Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: S3 GET failed for '/' XML Error Message:
<?xml version="1.0" encoding="UTF-8"?>
<Error>
<Code>InvalidRequest</Code>
<Message>The authorization mechanism you have provided is not supported.
         Please use AWS4-HMAC-SHA256.</Message>
</Error>

There's no mention of this on the Hadoop Amazon S3 page. Is Hadoop incompatible with S3 now or did I miss a configuration option?

I've tried adding -Dcom.amazonaws.services.s3.enableV4 as suggested on the SDK page, with no luck. I assume from this that Hadoop doesn't use the Amazon SDK.

FWIW I'm using Apache Spark, but it uses Hadoop.

EDIT: I found this Jira ticket.

Joe asked Dec 12 '14 13:12



1 Answer

You are probably trying to read the data with s3n, which will not work here. Switch to s3a, and don't forget to include the endpoint:

hdfs dfs -Dfs.s3a.awsAccessKeyId=<access key ID> -Dfs.s3a.awsSecretAccessKey=<secret access key> -Dfs.s3a.endpoint=<s3 endpoint> -ls s3a://<bucket_name>/...

You can find the endpoints here: http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
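
Since the question mentions Spark, here is a minimal sketch of how the same three settings might be passed to a Spark job. It relies on Spark copying any property prefixed with spark.hadoop. into the Hadoop configuration. The helper names (s3a_spark_conf, as_submit_args) are hypothetical, and the endpoint s3.eu-central-1.amazonaws.com is only an example value. The property names are the ones used in the command above; newer Hadoop documentation lists fs.s3a.access.key and fs.s3a.secret.key instead, so check them against your Hadoop version.

```python
def s3a_spark_conf(access_key, secret_key, endpoint):
    """Return Spark properties that forward the s3a settings to Hadoop.

    Hypothetical helper: the Hadoop property names are taken from the
    hdfs command in this answer and may differ in your Hadoop version.
    """
    hadoop_props = {
        "fs.s3a.awsAccessKeyId": access_key,
        "fs.s3a.awsSecretAccessKey": secret_key,
        "fs.s3a.endpoint": endpoint,
    }
    # Spark copies every "spark.hadoop.*" property into the Hadoop Configuration.
    return {"spark.hadoop." + key: value for key, value in hadoop_props.items()}


def as_submit_args(conf):
    """Render the properties as --conf arguments for spark-submit."""
    args = []
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Placeholder credentials; the endpoint is an assumed example region.
    conf = s3a_spark_conf(
        "<access key ID>",
        "<secret access key>",
        "s3.eu-central-1.amazonaws.com",
    )
    print(" ".join(as_submit_args(conf)))
```

The printed arguments can be added to a spark-submit call, after which paths of the form s3a://<bucket_name>/... should go through the s3a connector, provided the hadoop-aws module and the AWS SDK are on the classpath.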

Iulia answered Sep 25 '22 06:09
