Trouble setting Cache-Control header for Amazon S3 key using boto

My Django project uses django_compressor to store JavaScript and CSS files in an S3 bucket, using boto through the django-storages package.

The django-storages-related config includes:

if 'AWS_STORAGE_BUCKET_NAME' in os.environ:
    AWS_STORAGE_BUCKET_NAME = os.environ['AWS_STORAGE_BUCKET_NAME']
    AWS_HEADERS = {
        'Cache-Control': 'max-age=100000',
        'x-amz-acl': 'public-read',
    }
    AWS_QUERYSTRING_AUTH = False

    # This causes images to be stored in Amazon S3
    DEFAULT_FILE_STORAGE = 'storages.backends.s3boto.S3BotoStorage'

    # This causes CSS and other static files to be served from S3 as well.
    STATICFILES_STORAGE = 'storages.backends.s3boto.S3BotoStorage'
    STATIC_ROOT = ''
    STATIC_URL = 'https://{0}.s3.amazonaws.com/'.format(AWS_STORAGE_BUCKET_NAME)

    # This causes compressed CSS and JavaScript to also go in S3
    COMPRESS_STORAGE = STATICFILES_STORAGE
    COMPRESS_URL = STATIC_URL

This works, except that when I view the objects in the S3 management console, the equals sign in the Cache-Control header has been changed to %3D, as in max-age%3D100000, and this stops caching from working.
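
For reference, %3D is the percent-encoded form of the equals sign, so the stored value looks like the intended one after a URL-quoting pass. A minimal Python 3 sketch using only the standard library illustrates this; it does not touch boto or S3, and it does not show where in the upload path the encoding happens:

```python
from urllib.parse import quote, unquote

header_value = 'max-age=100000'

# '=' is not in quote()'s default safe set, so it is percent-encoded.
encoded = quote(header_value)
print(encoded)

# Decoding restores the original directive.
decoded = unquote(encoded)
print(decoded)
```

A browser or CDN receiving the encoded form would not recognise a max-age directive, which is consistent with caching not working.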

I wrote a little script to try to fix this along these lines:

from boto.s3.connection import S3Connection
from django.conf import settings

max_age = 30000000
cache_control = 'public, max-age={}'.format(max_age)

con = S3Connection(settings.AWS_ACCESS_KEY_ID, settings.AWS_SECRET_ACCESS_KEY)
bucket = con.get_bucket(settings.AWS_STORAGE_BUCKET_NAME)
for key in bucket.list():
    key.set_metadata('Cache-Control', cache_control)

but this does not change the metadata displayed in the Amazon S3 management console.

(Update: the documentation for S3 metadata says:

After you upload the object, you cannot modify object metadata. The only way to modify object metadata is to make a copy of the object and set the metadata. For more information, go to PUT Object - Copy in the Amazon Simple Storage Service API Reference. You can use the Amazon S3 management console to update the object metadata but internally it makes an object copy replacing the existing object to set the metadata.

so perhaps it is not so surprising that I can’t set the metadata. I assume set_metadata is only used when creating the data in the first place.

end update)

So my questions are: first, can I configure django-storages so that it creates the Cache-Control header correctly in the first place? Second, is the metadata set with set_metadata the same as the metadata shown in the S3 management console? If not, what is the latter, and how do I set it programmatically?

asked Feb 15 '23 22:02 by pdc


2 Answers

Using ASCII strings as values solved this for me.

AWS_HEADERS = {'Cache-Control': str('public, max-age=15552000')}
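
One plausible reading, stated here as an assumption rather than something verified against boto's source, is that boto 2 percent-encodes header values passed as unicode objects, which would explain why wrapping the value in str() helps on Python 2. The sketch below also derives the max-age figure instead of hard-coding it; cache_control_header is a hypothetical helper name, not part of django-storages or boto:

```python
from datetime import timedelta


def cache_control_header(lifetime, public=True):
    """Build a Cache-Control value as a native str (hypothetical helper)."""
    seconds = int(lifetime.total_seconds())
    directives = ['public'] if public else []
    directives.append('max-age={}'.format(seconds))
    # str() keeps the value a native string, mirroring the answer above.
    return str(', '.join(directives))


# 180 days gives the same value used in the answer above.
AWS_HEADERS = {'Cache-Control': cache_control_header(timedelta(days=180))}
print(AWS_HEADERS)
```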

answered May 09 '23 09:05 by Cloudream


If you want to add cache control while uploading the file:

from boto.s3.key import Key

# Fragment of a method: content_type, filename and contents come from the caller.
headers = {
    'Cache-Control': 'max-age=604800',  # 60 x 60 x 24 x 7 = 1 week
    'Content-Type': content_type,
}

k = Key(self.get_bucket())
k.key = filename
k.set_contents_from_string(contents.getvalue(), headers)
if self.public:
    k.make_public()
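
The upload snippet above takes content_type as given. As a rough sketch of one way to assemble the same headers dict, the helper below guesses the type from the file name with the standard library; build_upload_headers and the fallback type are illustrative assumptions, not part of boto:

```python
import mimetypes


def build_upload_headers(filename, max_age_seconds):
    """Return upload headers for filename (hypothetical helper)."""
    content_type, _encoding = mimetypes.guess_type(filename)
    return {
        'Cache-Control': 'max-age={}'.format(max_age_seconds),
        # Fall back to a generic binary type when the extension is unknown.
        'Content-Type': content_type or 'application/octet-stream',
    }


one_week = 60 * 60 * 24 * 7
headers = build_upload_headers('site.css', one_week)
print(headers)
```

The resulting dict can be passed as the headers argument of set_contents_from_string in the same way as the hand-written one.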

If you want to add cache control to existing files:

for key in bucket.list():
    print(key.name.encode('utf-8'))
    metadata = key.metadata
    metadata['Cache-Control'] = 'max-age=604800'  # 60 x 60 x 24 x 7 = 1 week
    # Copy the key onto itself so S3 rewrites the object with the new metadata.
    key.copy(AWS_BUCKET, key.name, metadata=metadata, preserve_acl=True)

This is tested in boto 2.32 - 2.40.
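
A slightly more defensive variant of the loop above can be sketched as a function. It rests on two assumptions worth checking against your boto version: that keys returned by bucket.list() do not carry their metadata, so each one is re-fetched with get_key, and that a copy with new metadata resets Content-Type unless it is passed again. set_cache_control is a hypothetical name; the bucket argument is expected to be a boto 2 Bucket.

```python
def set_cache_control(bucket, cache_control):
    """Rewrite every key in bucket in place with a new Cache-Control value.

    Returns the names of the keys that were copied.
    """
    updated = []
    for listed in bucket.list():
        # Re-fetch the key so its metadata and content type are populated.
        key = bucket.get_key(listed.name)
        if key is None:
            continue  # the key disappeared between list() and get_key()
        metadata = dict(key.metadata)  # copy, so the local key stays untouched
        metadata['Cache-Control'] = cache_control
        if key.content_type:
            metadata['Content-Type'] = key.content_type
        # Copying a key onto itself replaces the object and its metadata.
        key.copy(bucket.name, key.name, metadata=metadata, preserve_acl=True)
        updated.append(key.name)
    return updated
```

With a real connection this would be called as set_cache_control(bucket, 'max-age=604800'). It issues one extra request per key, which may matter for large buckets.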

answered May 09 '23 10:05 by Paul Kenjora