
S3ServiceException when using AWS RedshiftBasicEmitter

I am using the sample AWS Kinesis/Redshift code from GitHub. I ran it on an EC2 instance and hit the exception below. Note that emitting from Kinesis to S3 succeeded, but emitting from S3 to Redshift failed. Since both emitters in the same program use the same credentials, I am puzzled that only one of them failed.

I understand that most people who get the “The AWS Access Key Id you provided does not exist in our records” exception have probably not set up their S3 key pair properly. That does not seem to be the case here, since emitting to S3 succeeded. And if the credentials merely lacked read access, I would expect an authorization error instead.

Please comment if you have any insight.

Mar 16, 2014 4:32:49 AM com.amazonaws.services.kinesis.connectors.s3.S3Emitter emit
INFO: Successfully emitted 31 records to S3 in s3://mybucket/495362565978733426345566872055061454326385819810529281-49536256597873342638068737503047822713441029589972287489
Mar 16, 2014 4:32:50 AM com.amazonaws.services.kinesis.connectors.redshift.RedshiftBasicEmitter executeStatement
SEVERE: org.postgresql.util.PSQLException: ERROR: S3ServiceException:The AWS Access Key Id you provided does not exist in our records.,Status 403,Error InvalidAccessKeyId,Rid 5TY6Y784TT67,ExtRid qKzklJflmmgnhtttthbce+8T0NIR/sdd4RgffTgfgfdfgdfgfffgghgdse56f,CanRetry 1
  Detail: 
  -----------------------------------------------
  error:  S3ServiceException:The AWS Access Key Id you provided does not exist in our records.,Status 403,Error InvalidAccessKeyId,Rid 5TY6Y784TT67,ExtRid qKzklJflmmgnhtttthbce+8T0NIR/sdd4RgffTgfgfdfgdfgfffgghgdse56f,CanRetry 1
  code:      8001
  context:   Listing bucket=mfpredshift prefix=49536256597873342637951299872055061454326385819810529281-49536256597873342638068737503047822713441029589972287489
  query:     3464108
  location:  s3_utility.cpp:536
  process:   padbmaster [pid=8116]
  -----------------------------------------------

Mar 16, 2014 4:32:50 AM com.amazonaws.services.kinesis.connectors.redshift.RedshiftBasicEmitter emit
SEVERE: java.io.IOException: org.postgresql.util.PSQLException: ERROR: S3ServiceException:The AWS Access Key Id you provided does not exist in our records.,Status 403,Error InvalidAccessKeyId,Rid 5TY6Y784TT67,ExtRid qKzklJflmmgnhtttthbce+8T0NIR/sdd4RgffTgfgfdfgdfgfffgghgdse56f,CanRetry 1
  Detail: 
  -----------------------------------------------
  error:  S3ServiceException:The AWS Access Key Id you provided does not exist in our records.,Status 403,Error InvalidAccessKeyId,Rid 5TY6Y784TT67,ExtRid qKzklJflmmgnhtttthbce+8T0NIR/sdd4RgffTgfgfdfgdfgfffgghgdse56f,CanRetry 1
  code:      8001
  context:   Listing bucket=mybucket prefix=495362565978733426345566872055061454326385819810529281-49536256597873342638068737503047822713441029589972287489
  query:     3464108
  location:  s3_utility.cpp:536
  process:   padbmaster [pid=8116]
  -----------------------------------------------
user3424950 asked Mar 16 '14 05:03


1 Answer

I encountered the same error. I'm using an IAM role to get credentials. In my case, it was solved by modifying RedshiftBasicEmitter to add ;token=TOKEN to the CREDENTIALS parameter of the COPY statement (in the end I wrote my own IEmitter).

See http://docs.aws.amazon.com/redshift/latest/dg/r_COPY.html
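A likely reason the fix works, assuming the credentials come from an IAM role: role credentials are temporary, and a temporary access key is only valid together with its session token. The S3 client presumably sends the token automatically, whereas a COPY credentials string that contains only the key ID and secret makes Redshift call S3 without the token, so S3 reports the key as unknown. Below is a minimal sketch of building the credentials string with the optional token. CopyCredentials, build and copyStatement are hypothetical names for illustration and are not part of the connector library; in the AWS SDK for Java the token would typically come from AWSSessionCredentials.getSessionToken().

```java
// Sketch only: CopyCredentials is a hypothetical helper, not a class from
// the Kinesis connector library. It shows the shape of the CREDENTIALS
// value that the Redshift COPY command accepts.
final class CopyCredentials {
    private CopyCredentials() {}

    // Builds the value that goes inside COPY ... CREDENTIALS '...'.
    // sessionToken may be null or empty for long-term (non-role) keys.
    static String build(String accessKeyId, String secretKey, String sessionToken) {
        StringBuilder sb = new StringBuilder()
                .append("aws_access_key_id=").append(accessKeyId)
                .append(";aws_secret_access_key=").append(secretKey);
        if (sessionToken != null && !sessionToken.isEmpty()) {
            // Temporary credentials are rejected by S3 without this part.
            sb.append(";token=").append(sessionToken);
        }
        return sb.toString();
    }

    // Simplified COPY statement; the real emitter may add further options.
    // Inputs are not escaped here, so pass only trusted values.
    static String copyStatement(String table, String s3Path, String credentials) {
        return "COPY " + table + " FROM '" + s3Path + "' CREDENTIALS '" + credentials + "'";
    }
}
```

With an IAM role, the three values would be read from the same credentials provider at the moment the statement is built, since temporary credentials expire and are rotated.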

kawty answered Jan 02 '23 12:01