
How To Delete S3 Files Starting With

Let's say I have images of different sizes on S3:

137ff24f-02c9-4656-9d77-5e761d76a273.webp
137ff24f-02c9-4656-9d77-5e761d76a273_500_300.webp
137ff24f-02c9-4656-9d77-5e761d76a273_400_280.webp

I am using boto to delete a single file:

from boto.s3.key import Key

bucket = get_s3_bucket()  # my own helper that returns a boto Bucket
s3_key = Key(bucket)
s3_key.key = '137ff24f-02c9-4656-9d77-5e761d76a273.webp'
bucket.delete_key(s3_key)

But I would like to delete all keys starting with 137ff24f-02c9-4656-9d77-5e761d76a273.

Keep in mind there might be hundreds of files in the bucket, so I don't want to iterate over all of them. Is there a way to delete only the files whose keys start with a certain string?

Maybe some kind of regex delete function?

Richard Knop asked Feb 19 '14 13:02




2 Answers

The S3 service does support a multi-delete operation allowing you to delete up to 1000 objects in a single API call. However, this API call doesn't provide support for server-side filtering of the keys. You have to provide the list of keys you want to delete.

You could roll your own. First, you would want to get a list of all the keys you want to delete.

import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
to_delete = list(bucket.list(prefix='137ff24f-02c9-4656-9d77-5e761d76a273'))

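The list call returns a lazy iterable that fetches results page by page, not a list. Wrapping it in list() pulls in every match, so the to_delete variable now holds all of the objects in the bucket whose keys start with the prefix I provided. The prefix filter is applied by S3 when listing, so only matching keys are returned and you never iterate over the rest of the bucket.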

Now, we need to create chunks of up to 1000 objects from the big list and use the chunk to call the delete_keys method of the bucket object.

for chunk in [to_delete[i:i+1000] for i in range(0, len(to_delete), 1000)]:
    result = bucket.delete_keys(chunk)
    if result.errors:
        print('The following errors occurred')
        for error in result.errors:
            print(error)

There are more efficient ways to do this (e.g. without converting the bucket generator into a list) and you probably want to do something different when handling the errors but this should give you a start.
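One way to avoid building the full list is to slice the iterable lazily. The following is only a sketch: chunked and delete_in_chunks are hypothetical helper names, not part of boto, and the code assumes the bucket object exposes delete_keys returning a result with an errors attribute, as in the snippet above.

```python
from itertools import islice


def chunked(iterable, size=1000):
    """Yield lists of at most `size` items without materializing the input."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_in_chunks(bucket, keys, size=1000):
    """Call bucket.delete_keys once per chunk and collect the reported errors.

    `bucket` is assumed to behave like a boto Bucket (hypothetical helper).
    """
    errors = []
    for chunk in chunked(keys, size):
        result = bucket.delete_keys(chunk)
        errors.extend(result.errors)
    return errors
```

With a real bucket this would be called as delete_in_chunks(bucket, bucket.list(prefix='137ff24f-02c9-4656-9d77-5e761d76a273')), and the returned list can then be logged or retried instead of printed.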

garnaat answered Oct 05 '22 22:10


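You can do it using the AWS CLI (https://aws.amazon.com/cli/) and some Unix commands.

This AWS CLI command should work:

aws s3 rm s3://<your_bucket_name>/ --recursive --exclude "*" --include "137ff24f-02c9-4656-9d77-5e761d76a273*"

The --recursive flag is needed here: without it, aws s3 rm targets a single object and the --exclude/--include filters are not applied across the bucket. Add --dryrun first to see what would be deleted without deleting anything.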

Or with Unix commands:

aws s3 ls s3://<your_bucket_name>/137ff24f-02c9-4656-9d77-5e761d76a273 | awk '{print $4}' | xargs -I% aws s3 rm s3://<your_bucket_name>/%

Explanation: list the objects whose keys start with the prefix, pipe the listing to awk to extract the 4th column (the file name), then pipe each name to aws s3 rm. This assumes the file names contain no spaces.

ggcarmi answered Oct 05 '22 22:10