Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Could we use AWS Glue just copy a file from one S3 folder to another S3 folder?

I need to copy a zipped file from one AWS S3 folder to another and would like to make that a scheduled AWS Glue job. I cannot find an example for such a simple task. Please help if you know the answer. May be the answer is in AWS Lambda, or other AWS tools.

Thank you very much!

like image 650
Jie Avatar asked Dec 14 '22 19:12

Jie


1 Answers

You can do this, and there may be a reason to use AWS Glue: if you have chained Glue jobs and glue_job_#2 is triggered on the successful completion of glue_job_#1.

The simple Python script below moves a file from one S3 folder (source) to another folder (target) using the boto3 library, and optionally deletes the original copy in source directory.

import boto3

bucketname = "my-unique-bucket-name"
s3 = boto3.resource('s3')
my_bucket = s3.Bucket(bucketname)
source = "path/to/folder1"
target = "path/to/folder2"

for obj in my_bucket.objects.filter(Prefix=source):
    source_filename = (obj.key).split('/')[-1]
    copy_source = {
        'Bucket': bucketname,
        'Key': obj.key
    }
    target_filename = "{}/{}".format(target, source_filename)
    s3.meta.client.copy(copy_source, bucketname, target_filename)
    # Uncomment the line below if you wish the delete the original source file
    # s3.Object(bucketname, obj.key).delete()

Reference: Boto3 Docs on S3 Client Copy

Note: I would use f-strings for generating the target_filename, but f-strings are only supported in >= Python3.6 and I believe the default AWS Glue Python interpreter is still 2.7.

Reference: PEP on f-strings

like image 98
tatlar Avatar answered May 16 '23 06:05

tatlar