
How To Re-encode AWS Lambda Event Encoding of S3 Key in Python 3?


I am writing a Python 3 AWS Lambda function that takes the S3 bucket and key (source_key) from the Lambda event object and copies the file to another S3 bucket under the same key (destination_key).

However, the S3 key in the event object is encoded in such a way that, when I use the source_key value to write to the destination bucket, S3 returns a 404 error.

Key returned by S3 Lambda Event object:

'object': {'key': 'SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility+-+Bid+Extension+Notice.eml' 

Error when submitting 'key' value back to S3:

{'Error': {'Code': 'NoSuchKey', 'Message': 'The specified key does not exist.', 'Key': 'SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility+-+Bid+Extension+Notice.eml'}, 'ResponseMetadata': {'RequestId': '2C0154D58032B5B4', 'HostId': 'zxp56SHdODohW5ln8B5GOW+YPqGfL4/kJGD+qV46yMhLZU92BrOC/hlh/HPHywAuGuJiICL0RFk=', 'HTTPStatusCode': 404, 'HTTPHeaders': {'x-amz-request-id': '2C0154D58032B5B4', 'x-amz-id-2': 'zxp56SHdODohW5ln8B5GOW+YPqGfL4/kJGD+qV46yMhLZU92BrOC/hlh/HPHywAuGuJiICL0RFk=', 'content-type': 'application/xml', 'transfer-encoding': 'chunked', 'date': 'Thu, 20 Sep 2018 16:40:00 GMT', 'server': 'AmazonS3'}, 'RetryAttempts': 0}}

I simply used boto3 to copy source_key to destination_key while specifying a different bucket:

copy_source = {'Bucket': source_bucket, 'Key': source_key}
destination_key = source_key

s3resource.copy(copy_source, destination_bucket, destination_key)

This routine works perfectly as long as source_key does not contain any special characters (space, comma, etc.).

How can I process the source_key to make sure that it is compatible as a destination key? I could not find any documentation on what S3 expects for encoding.

Doug Bower asked Sep 20 '18 17:09



1 Answer

S3 keys in event messages are URL encoded. From AWS documentation:

The s3 key provides information about the bucket and object involved in the event. The object key name value is URL encoded. For example, "red flower.jpg" becomes "red+flower.jpg" (Amazon S3 returns "application/x-www-form-urlencoded" as the content type in the response).
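As a quick check of what that encoding means for the key in the question, here is a small sketch using only the standard library. It compares unquote_plus with plain unquote; the variable names are my own and are not taken from the AWS documentation.

```python
from urllib.parse import unquote, unquote_plus

# Key exactly as it appears in the event from the question.
encoded_key = (
    "SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility"
    "+-+Bid+Extension+Notice.eml"
)

# unquote_plus resolves %XX escapes and also turns '+' into a space.
decoded_key = unquote_plus(encoded_key)
print(decoded_key)
# SBN-Fwd_ USPS - Springdale, OH - Mail Processing Facility - Bid Extension Notice.eml

# Plain unquote resolves %XX escapes but leaves every '+' untouched,
# so it would still not match the real object key.
partially_decoded = unquote(encoded_key)
print(partially_decoded)
```

The first result is the key as it is actually stored in the bucket, which is why it has to be the value passed back to S3.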

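To reuse the key you need to decode it first. In Python 3 you can use unquote_plus from urllib.parse, which resolves the %XX escapes and turns each "+" back into a space. Use the decoded key for the copy source as well as the destination: the NoSuchKey error most likely comes from looking up the still-encoded key in the source bucket. The bucket name can be used as it arrives, since the documentation quoted above only describes the object key as URL encoded.

from urllib.parse import unquote_plus

# Decode once, then use the decoded key for both source and destination.
source_key = unquote_plus(source_key, encoding='utf-8')

copy_source = {'Bucket': source_bucket, 'Key': source_key}
destination_key = source_key

s3resource.copy(copy_source, destination_bucket, destination_key)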
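For context, here is a minimal sketch of how the decoding might fit into a whole handler, assuming the standard S3 notification layout (Records, then s3.bucket.name and s3.object.key). The helper name copy_event_objects and the DEST_BUCKET environment variable are hypothetical and not part of the question; the client is passed in so the helper can be exercised without AWS access.

```python
from urllib.parse import unquote_plus


def copy_event_objects(event, s3_client, destination_bucket):
    """Copy every object named in an S3 event to destination_bucket.

    Returns the list of decoded keys that were copied.
    """
    copied = []
    for record in event.get("Records", []):
        source_bucket = record["s3"]["bucket"]["name"]
        # The key arrives URL-encoded; decode it once and reuse the result.
        key = unquote_plus(record["s3"]["object"]["key"], encoding="utf-8")
        copy_source = {"Bucket": source_bucket, "Key": key}
        s3_client.copy(copy_source, destination_bucket, key)
        copied.append(key)
    return copied


def lambda_handler(event, context):
    # Imported here so the helper above can be tested without boto3 installed.
    import os
    import boto3

    # DEST_BUCKET is a hypothetical environment variable, not from the question.
    return copy_event_objects(event, boto3.client("s3"), os.environ["DEST_BUCKET"])
```

Decoding in one place keeps the source and destination keys identical, so an object with spaces or commas in its name ends up under the same name in the destination bucket.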

lspoken answered Oct 05 '22 10:10
