
S3 Object Expiration using boto

I am trying to find a way to clean up my S3 bucket: I want to delete all the keys that are older than X days (in my case, X is 30).

So far I have not found a way to make S3 delete the objects.

I tried the following approaches, and neither worked. By "worked" I mean that when I fetched the object after the expiry time had passed, S3 was still serving it; I was expecting an "Object not found" or "Object expired" message.

Approach 1:

    k = Key(bucket)
    k.key = my_key_name
    expires = datetime.utcnow() + timedelta(seconds=(10))
    expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    k.set_contents_from_filename(filename,headers={'Expires':expires})

Approach 2:

    k = Key(bucket)
    k.key = "Event_" + str(key_name) + "_report"
    expires = datetime.utcnow() + timedelta(seconds=(10))
    expires = expires.strftime("%a, %d %b %Y %H:%M:%S GMT")
    k.set_metadata('Expires', expires)
    k.set_contents_from_filename(filename)

If anyone can share working code that deletes S3 objects, that would be great.

user2005798 asked Feb 19 '13 23:02


3 Answers

You can use lifecycle policies to delete objects from S3 that are older than X days. For example, suppose you have these objects:

    logs/first
    logs/second
    logs/third
    otherfile.txt

To expire everything under logs/ after 30 days, you could write:

    import boto
    from boto.s3.lifecycle import (
        Lifecycle,
        Expiration,
    )

    lifecycle = Lifecycle()
    lifecycle.add_rule(
        'rulename',
        prefix='logs/',
        status='Enabled',
        expiration=Expiration(days=30)
    )

    s3 = boto.connect_s3()
    bucket = s3.get_bucket('boto-lifecycle-test')
    bucket.configure_lifecycle(lifecycle)

You can also retrieve the lifecycle configuration:

    >>> config = bucket.get_lifecycle_config()
    >>> print(config[0])
    <Rule: rulename>
    >>> print(config[0].prefix)
    logs/
    >>> print(config[0].expiration)
    <Expiration: in: 30 days>

jamesls answered Nov 15 '22 20:11


The answer by jamesls uses boto, the older version of the library, which is being deprecated. The currently supported version is boto3.

The same expiration policy on the logs/ prefix can be set up as follows:

    import boto3
    from botocore.exceptions import ClientError

    client = boto3.client('s3')
    try:
        policy_status = client.put_bucket_lifecycle_configuration(
            Bucket='boto-lifecycle-test',
            LifecycleConfiguration={
                'Rules': [
                    {
                        # Expire objects 30 days after creation.
                        # 'ExpiredObjectDeleteMarker' cannot be combined with 'Days'.
                        'Expiration': {'Days': 30},
                        # 'Filter' replaces the deprecated top-level 'Prefix' key.
                        'Filter': {'Prefix': 'logs/'},
                        'Status': 'Enabled',
                    }
                ]
            },
        )
    except ClientError as e:
        print("Unable to apply lifecycle configuration.\nReason: {0}".format(e))

This call replaces the bucket's entire existing lifecycle configuration, not only the rules that apply to logs/.
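
Because put_bucket_lifecycle_configuration writes the whole configuration in one go, one way to keep rules that are already on the bucket is to read them first and write back the merged list. The following is a minimal sketch, not an official recipe: `upsert_rule` and `add_rule_keeping_existing` are hypothetical helper names, the rule ID `expire-logs` is made up for illustration, and the sketch assumes that a bucket without any lifecycle configuration is reported with the error code `NoSuchLifecycleConfiguration`.

```python
def upsert_rule(existing_rules, new_rule):
    """Return a new list in which new_rule replaces any rule with the same ID."""
    kept = [rule for rule in existing_rules if rule.get('ID') != new_rule['ID']]
    return kept + [new_rule]


def add_rule_keeping_existing(client, bucket, new_rule):
    """Read the current rules, merge in new_rule, and write the result back."""
    try:
        current = client.get_bucket_lifecycle_configuration(Bucket=bucket)['Rules']
    except client.exceptions.ClientError as e:
        # A bucket with no lifecycle configuration yet is not an error here.
        if e.response['Error']['Code'] != 'NoSuchLifecycleConfiguration':
            raise
        current = []
    merged = upsert_rule(current, new_rule)
    client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={'Rules': merged},
    )
    return merged


# Hypothetical rule: expire everything under logs/ after 30 days.
EXPIRE_LOGS = {
    'ID': 'expire-logs',
    'Filter': {'Prefix': 'logs/'},
    'Status': 'Enabled',
    'Expiration': {'Days': 30},
}

# Usage (needs boto3 and AWS credentials):
#   import boto3
#   add_rule_keeping_existing(boto3.client('s3'), 'boto-lifecycle-test', EXPIRE_LOGS)
```

Matching on the rule ID means that running the helper twice updates the rule instead of adding a duplicate.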

It is a good idea to check that the bucket exists and that you have permission to access it before applying the expiration configuration, i.e. before the try-except. head_bucket raises a ClientError if either check fails:

    bucket_exists = client.head_bucket(
        Bucket='boto-lifecycle-test'
    )
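
Since head_bucket signals a missing or inaccessible bucket by raising rather than by its return value, the check can be wrapped so that it yields a boolean. This is only a sketch: `bucket_is_accessible` is a hypothetical helper name, and it assumes the error code for a HEAD request is the HTTP status as a string ('404' for a missing bucket, '403' for a forbidden one).

```python
def bucket_is_accessible(client, bucket):
    """Return True if head_bucket succeeds, False if the bucket is missing or forbidden."""
    try:
        client.head_bucket(Bucket=bucket)
    except client.exceptions.ClientError as e:
        # Assumption: HEAD responses have no body, so the code is the HTTP status.
        if e.response['Error']['Code'] in ('403', '404'):
            return False
        # Anything else (throttling, network issues, ...) is left to the caller.
        raise
    return True


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   if bucket_is_accessible(boto3.client('s3'), 'boto-lifecycle-test'):
#       ...  # apply the lifecycle configuration
```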

Since logs/ is not a bucket but a key prefix inside the bucket boto-lifecycle-test, the rest of the bucket can be covered by other rules with a different expiration policy. You can check this by inspecting the rules returned in policy_exists, as below.

    policy_exists = client.get_bucket_lifecycle_configuration(
        Bucket='boto-lifecycle-test')
    bucket_policy = policy_exists['Rules'][0]['Expiration']
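
Indexing `['Rules'][0]` only looks at the first rule, which may not be the one for logs/ when the bucket has several. A small sketch of looking a rule up by prefix follows; `expiration_days_for_prefix` is a hypothetical helper, and it only handles rules whose filter is a plain prefix (rules using tags or an 'And' filter are ignored).

```python
def expiration_days_for_prefix(rules, prefix):
    """Return the 'Days' of the first enabled rule for exactly this prefix, else None."""
    for rule in rules:
        # Newer rules keep the prefix under 'Filter'; older ones use a top-level 'Prefix'.
        rule_prefix = rule.get('Filter', {}).get('Prefix', rule.get('Prefix'))
        if rule.get('Status') == 'Enabled' and rule_prefix == prefix:
            return rule.get('Expiration', {}).get('Days')
    return None


# Example with a response shaped like the one from get_bucket_lifecycle_configuration.
sample = {
    'Rules': [
        {'ID': 'tmp', 'Filter': {'Prefix': 'tmp/'}, 'Status': 'Enabled',
         'Expiration': {'Days': 1}},
        {'ID': 'logs', 'Filter': {'Prefix': 'logs/'}, 'Status': 'Enabled',
         'Expiration': {'Days': 30}},
    ]
}
logs_days = expiration_days_for_prefix(sample['Rules'], 'logs/')
```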

More information about setting the expiration policy can be found under Expiry policy.
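
Lifecycle rules are applied asynchronously, so objects can remain readable for some time after they become eligible for expiration. For a one-off clean-up that should not wait for that, the keys older than X days can also be selected on the client side and deleted explicitly. The sketch below is one possible way to do it, under stated assumptions: `keys_older_than` and `delete_old_objects` are hypothetical helper names, object age is judged by the `LastModified` timestamp, and each listing page is assumed to hold at most 1000 keys so that it fits in a single delete_objects request.

```python
from datetime import datetime, timedelta, timezone


def keys_older_than(objects, days, now=None):
    """Return the keys whose LastModified lies more than `days` days before `now`."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    return [obj['Key'] for obj in objects if obj['LastModified'] < cutoff]


def delete_old_objects(client, bucket, prefix, days):
    """Delete objects under `prefix` older than `days` days; return how many were requested."""
    paginator = client.get_paginator('list_objects_v2')
    requested = 0
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # 'Contents' is absent when a page has no objects.
        old_keys = keys_older_than(page.get('Contents', []), days)
        if old_keys:
            client.delete_objects(
                Bucket=bucket,
                Delete={'Objects': [{'Key': key} for key in old_keys]},
            )
            requested += len(old_keys)
    return requested


# Usage (needs boto3 and AWS credentials):
#   import boto3
#   delete_old_objects(boto3.client('s3'), 'boto-lifecycle-test', 'logs/', 30)
```

Unlike the lifecycle rule, this runs once and has to be scheduled by you if it should repeat.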

Vaulstein answered Nov 15 '22 21:11


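Vaulstein's script can fail with a MalformedXML exception. The trailing "," after 'Status': 'Enabled' is valid Python and is not the cause: S3 rejects a rule whose 'Expiration' combines 'ExpiredObjectDeleteMarker' with 'Days', and a rule should use either 'Filter' or the deprecated top-level 'Prefix', not both. Remove 'ExpiredObjectDeleteMarker' and the top-level 'Prefix' if they are present.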

Akash Shukla answered Nov 15 '22 21:11