How To Delete S3 Files Starting With

Let's say I have images of different sizes on S3:

137ff24f-02c9-4656-9d77-5e761d76a273.webp
137ff24f-02c9-4656-9d77-5e761d76a273_500_300.webp
137ff24f-02c9-4656-9d77-5e761d76a273_400_280.webp

I am using boto to delete a single file:

from boto.s3.key import Key

bucket = get_s3_bucket()  # helper (not shown) that returns a boto Bucket
s3_key = Key(bucket)
s3_key.key = '137ff24f-02c9-4656-9d77-5e761d76a273.webp'
bucket.delete_key(s3_key)

But I would like to delete all keys starting with 137ff24f-02c9-4656-9d77-5e761d76a273.

Keep in mind there might be hundreds of files in the bucket, so I don't want to iterate over all of them. Is there a way to delete only the files starting with a certain string?

Maybe some kind of regex delete function?

asked Feb 19 '14 13:02

Richard Knop

2 Answers

The S3 service does support a multi-delete operation allowing you to delete up to 1000 objects in a single API call. However, this API call doesn't provide support for server-side filtering of the keys. You have to provide the list of keys you want to delete.

You could roll your own. First, you would want to get a list of all the keys you want to delete.

import boto

s3 = boto.connect_s3()
bucket = s3.get_bucket('mybucket')
to_delete = list(bucket.list(prefix='137ff24f-02c9-4656-9d77-5e761d76a273'))

The list call returns a lazily evaluated result set rather than a list, so I convert it with list(). The to_delete variable now holds all of the objects in the bucket whose keys match the prefix I provided.

Now we need to split the big list into chunks of up to 1000 objects and pass each chunk to the delete_keys method of the bucket object.

for chunk in [to_delete[i:i+1000] for i in range(0, len(to_delete), 1000)]:
    result = bucket.delete_keys(chunk)
    if result.errors:
        print('The following errors occurred')
        for error in result.errors:
            print(error)

There are more efficient ways to do this (e.g. without converting the bucket generator into a list) and you probably want to do something different when handling the errors but this should give you a start.
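As a rough sketch of the "without converting to a list" idea, the listing can be consumed lazily and sent to delete_keys in batches of up to 1000. The helper names iter_chunks and delete_by_prefix are made up for this example, and the code assumes bucket is a boto Bucket like the one above. Errors are collected and returned instead of printed, which is only one possible way to handle them.

```python
from itertools import islice


def iter_chunks(iterable, size=1000):
    """Yield lists of at most `size` items without materializing the whole iterable."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


def delete_by_prefix(bucket, prefix, chunk_size=1000):
    """Delete every key under `prefix`, one multi-delete request per chunk.

    `bucket` is assumed to be a boto Bucket (anything with compatible
    list() and delete_keys() methods works). Returns the per-key errors
    reported by S3, or an empty list if there were none.
    """
    errors = []
    for chunk in iter_chunks(bucket.list(prefix=prefix), chunk_size):
        result = bucket.delete_keys(chunk)
        errors.extend(result.errors)
    return errors
```

With the bucket from the earlier snippet, this would be called as errors = delete_by_prefix(bucket, '137ff24f-02c9-4656-9d77-5e761d76a273'). Only one chunk of keys is held in memory at a time.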

answered Oct 05 '22 22:10

garnaat

You can do it using the AWS CLI (https://aws.amazon.com/cli/) and some Unix commands.

This AWS CLI command should work:

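aws s3 rm s3://<your_bucket_name>/ --recursive --exclude "*" --include "137ff24f-02c9-4656-9d77-5e761d76a273*"

The --recursive flag is needed for the filters to be applied to every key in the bucket; without it, rm expects the path of a single object. Put a * in front of the pattern as well if you also want to match keys inside sub-folders.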

or with unix commands:

aws s3 ls s3://<your_bucket_name>/137ff24f-02c9-4656-9d77-5e761d76a273 | awk '{print $4}' | xargs -I% aws s3 rm s3://<your_bucket_name>/%

Explanation: list the files in the bucket whose names start with the prefix --pipe--> take the 4th column (the file name) --pipe--> delete each one with the AWS CLI. Note that awk splits on whitespace, so this breaks for file names that contain spaces.
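If you would rather stay in Python than shell out, a minimal sketch with boto3 (the newer SDK, which is not what the question uses) could look like the following. The function name delete_prefix_boto3 is made up for this example, and bucket is assumed to be a boto3 Bucket resource, e.g. boto3.resource('s3').Bucket('<your_bucket_name>').

```python
def delete_prefix_boto3(bucket, prefix):
    """Delete all objects whose keys start with `prefix`.

    `bucket` is assumed to be a boto3 s3.Bucket resource. The collection's
    delete() sends batched multi-delete requests and returns one response
    dict per batch; any per-key failures are listed under 'Errors'.
    """
    responses = bucket.objects.filter(Prefix=prefix).delete()
    errors = []
    for response in responses:
        errors.extend(response.get('Errors', []))
    return errors
```

As with the CLI command, the prefix filter is applied by S3 when listing, so only the matching keys are fetched and deleted.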

answered Oct 05 '22 22:10

ggcarmi

Tags: python, amazon-web-services, amazon-s3, boto
