Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can I delete old files by name in S3 bucket?

Much like in S3-Bucket/Management/Lifecycles using prefixes, I'd like to prune old files that have certain words.

I'm looking to remove files that start with Screenshot or has screencast in the filename older than 365 days.

Examples

  • /Screenshot 2017-03-19 10.11.12.png
  • folder1/Screenshot 2019-03-01 14.31.55.png
  • folder2/sub_folder/project-screencast.mp4

I'm currently testing if lifecycle prefixes work on files too.

like image 649
Chance Smith Avatar asked Mar 20 '19 13:03

Chance Smith


People also ask

How do I Delete a large number of objects in S3 bucket?

To delete objectsSign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/ . Navigate to the Amazon S3 bucket or folder that contains the objects that you want to delete. Select the check box to the left of the names of the objects that you want to delete.

How do I Delete items from my AWS S3?

To delete the object, select the object, and choose delete and confirm your choice by typing delete in the text field. On, Amazon S3 will permanently delete the object version. Select the object version that you want to delete, and choose delete and confirm your choice by typing permanently delete in the text field.

How will you Delete a bucket that has multiple files and folders in it?

To delete multiple files from an S3 Bucket with the AWS CLI, run the s3 rm command, passing in the exclude and include parameters to filter the files the command is applied to.


2 Answers

You can write a program to do it, such as this Python script:

import boto3

s3 = boto3.client('s3', region_name='ap-southeast-2')
response = s3.list_objects_v2(Bucket='my-bucket')

keys_to_delete = [{'Key': object['Key']} 
                  for object in response['Contents'] 
                  if object['LastModified'] < datetime(2018, 3, 20)
                     and ('Screenshot' in object['Key'] or 'screencast' in object['Key'])
                 ]

s3.delete_objects(Bucket='my-bucket', Delete={'Objects': keys_to_delete})

You could modify it to be "1 year ago" rather than a specific date.

like image 87
John Rotenstein Avatar answered Oct 17 '22 10:10

John Rotenstein


I don't believe that you can apply lifecycle rules with wildcards such as *screencast*, only with prefixes such as "taxes/" or "taxes/2010".

For your case, I would probably write a script (or perhaps an Athena query) to filter an S3 Inventory report for those files that match your name/age conditions, and then prune them.

Of course, you could write a program to do this as @John Rotenstein suggests. The one time that might not be ideal is if you have millions or billions of objects because the time to enumerate the list of objects would be significant. But it would be fine for a reasonable number of objects.

like image 33
jarmod Avatar answered Oct 17 '22 09:10

jarmod