
Cheapest way to delete 2 billion objects from S3 IA

I have an S3 bucket (Infrequent Access storage class) containing 2 billion objects. It is too big to delete in the console or through the API without taking years.

I can create a lifecycle rule to expire and delete the objects, but the calculator predicts this will cost me more than $20,000. Is that correct? Is there a better way to delete a bucket?

I have a file effectively containing a list of all the objects in that bucket if that helps.

Update 2021:

An answer below from @MAP points out that there is now an "Empty" button. I haven't tested it yet, but it looks like the way to go (I'll accept that answer once tested):

[Screenshot: the "Empty" button in the S3 console]

matt burns asked Jan 18 '19 14:01




4 Answers

If you have a list of all the objects, you can use the Multi-Object Delete action. DELETE requests are free. I would create an AWS Step Functions state machine to loop through the file and delete 1000 objects at a time, which is the maximum a single request accepts.

Deleting 2 billion objects at 1000 per request takes around 2M Step Functions state transitions. At Step Functions pricing that comes to around $50, plus around $1 for the Lambda invocations, so the total cost is roughly $51.
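To make that estimate easy to check, here is the arithmetic as a small Python sketch. The price of $0.025 per 1,000 state transitions is an assumption on my part (it is what the $50 figure implies), and it also assumes exactly one state transition per batch, which a real looping state machine would likely exceed.

```python
import math

OBJECTS = 2_000_000_000           # from the question
KEYS_PER_REQUEST = 1000           # Multi-Object Delete limit per request
# Assumed price, inferred from the ~$50 figure above; check current pricing.
USD_PER_1000_TRANSITIONS = 0.025

# One delete request (and, by assumption, one state transition) per batch.
delete_requests = math.ceil(OBJECTS / KEYS_PER_REQUEST)
step_functions_cost = delete_requests / 1000 * USD_PER_1000_TRANSITIONS

print(f"{delete_requests:,} requests, about ${step_functions_cost:.2f}")
# prints "2,000,000 requests, about $50.00"
```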

Update

Using Lambda or Step Functions is probably not the most cost-effective option, because either way you need to read the file containing the object keys from some source such as S3. So I think running a script from a local machine, or from an EC2 Linux instance inside a screen session, is the best option.
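Such a script could look roughly like the sketch below. It assumes boto3 is installed and credentials are configured, and that the key list has one object key per line. The names `keys.txt` and `my-bucket` are placeholders, not anything from the question. It handles unversioned objects only; a versioned bucket would also need version IDs.

```python
from itertools import islice
from typing import Iterable, Iterator, List

BATCH_SIZE = 1000  # Multi-Object Delete accepts at most 1000 keys per request


def batches(keys: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` keys."""
    it = iter(keys)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def read_keys(path: str) -> Iterator[str]:
    """Stream keys from a text file, one key per line (assumed format)."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            key = line.rstrip("\n")
            if key:
                yield key


def delete_listed_keys(client, bucket: str, keys: Iterable[str]) -> int:
    """Delete keys in batches; return how many failures S3 reported."""
    failed = 0
    for chunk in batches(keys):
        response = client.delete_objects(
            Bucket=bucket,
            # Quiet mode makes the response list only the keys that failed.
            Delete={"Objects": [{"Key": k} for k in chunk], "Quiet": True},
        )
        failed += len(response.get("Errors", []))
    return failed


def main() -> None:
    import boto3  # third-party; imported here so the helpers work without it

    s3 = boto3.client("s3")
    failures = delete_listed_keys(s3, "my-bucket", read_keys("keys.txt"))
    print(f"done, {failures} keys failed")
```

Call `main()` to run it. The file is streamed rather than loaded into memory, which matters with 2 billion keys. A single sequential loop like this will still be slow at that scale, so in practice you would probably split the key file and run several copies in parallel.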

A.Khan answered Oct 12 '22 12:10


As of 2021, anyone who comes across this question may benefit from knowing that the AWS console now provides an "Empty" button.

Select the bucket and click the "Empty" button, and all objects, versioned or not, will be deleted. Depending on the number of objects, it can take anywhere from minutes to days.

MAP answered Oct 12 '22 12:10


Expiration lifecycle rules are free. From the original feature announcement:

As with standard delete requests, Amazon S3 doesn’t charge you for using Object Expiration.

PencilBow answered Oct 12 '22 12:10


Delete operations are free. You can create a lifecycle policy to automate a bulk delete.

I would start with a small number of objects and check the billing report to confirm that the delete is not charged, then go for the rest.
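For example, a rule scoped to a prefix lets you try this on a small subset first. Below is a minimal boto3 sketch; the rule ID, the `test/` prefix and the bucket name are placeholders of mine. Note that `put_bucket_lifecycle_configuration` replaces any lifecycle configuration already on the bucket, and that this rule only covers current object versions.

```python
def expiration_config(prefix: str = "", days: int = 1) -> dict:
    """Build a lifecycle configuration that expires objects under `prefix`.

    An empty prefix matches every object in the bucket.
    """
    return {
        "Rules": [
            {
                "ID": "bulk-delete",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Expiration": {"Days": days},
            }
        ]
    }


def apply_expiration(client, bucket: str, prefix: str = "", days: int = 1) -> None:
    """Install the rule; this overwrites the bucket's existing lifecycle rules."""
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration=expiration_config(prefix, days),
    )


def main() -> None:
    import boto3  # third-party; requires configured AWS credentials

    s3 = boto3.client("s3")
    # Trial run on a small prefix; check the bill, then repeat with prefix="".
    apply_expiration(s3, "my-bucket", prefix="test/")
```

Call `main()` to apply it. Expiration runs asynchronously on S3's side, so the objects will not disappear the moment the rule is saved.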

Sébastien Stormacq answered Oct 12 '22 13:10