
Amazon S3 cache audio files

I have created a new music application and I store all the mp3 files on Amazon S3. Before moving to S3 I stored them on the server's file system. The browser used to cache the files, so on consecutive reloads of the page they weren't downloaded from the server again. But since moving to S3, every time I load the page it downloads the files from S3. This not only makes my app slow, but every request to S3 costs money. I found some documentation on Cache-Control and tried everything, but with no success. I might be missing something here. Any help is appreciated. Thanks.

Here is my code for uploading mp3 files on S3. I use CarrierWave with Rails.

CarrierWave.configure do |config|
    config.fog_credentials = {
      :provider              => 'AWS',
      :aws_access_key_id     => MyAppConfig.config['aws']['aws_access_key'],
      :aws_secret_access_key => MyAppConfig.config['aws']['aws_secret_key'],
    }
    config.fog_directory  = MyAppConfig.config['aws']['aws_bucket_name']
    config.fog_public     = false   # private objects, so CarrierWave generates signed URLs
    config.storage        = :fog
    config.fog_attributes = {'Cache-Control' => 'max-age=315576000'}   # ~10 years
end
asked Dec 23 '13 by pramodtech

3 Answers

If you're using signed URLs, which you say in the comments that you are, and you are not reusing those signed URLs, then there is no way to cache these requests.

Amazon Web Services cannot override your web browser's caching behavior. When two URIs are distinct, as signed URLs are, your browser treats them as two different resources on the Internet.

For example, let's take:

http://www.example.com/song1.mp3
http://www.example.com/song2.mp3

These are two discrete URIs. Even if song1.mp3 and song2.mp3 had the same ETag and Content-Length HTTP response headers, they're still two different resources.

The same is true if we merely alter their query strings:

http://www.example.com/song1.mp3?a=1&b=2&c=3
http://www.example.com/song1.mp3?a=1&b=2&c=4

These are still two discrete URIs. They will not reference one another for purposes of caching. This is the principle behind using query strings to deliberately bust caches.
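You can check the point directly with Ruby's standard library, using the example URLs above: the two URIs name the same path on the server, yet they are unequal, so a browser keeps separate cache entries for each.

```ruby
require 'uri'

a = URI("http://www.example.com/song1.mp3?a=1&b=2&c=3")
b = URI("http://www.example.com/song1.mp3?a=1&b=2&c=4")

a.path == b.path  # same object on the server
a == b            # but distinct URIs, hence distinct browser cache entries
```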

No amount of fiddling with HTTP headers will ever get you the cache behavior you're seeking.

answered Sep 24 '22 by Jacob Budin


Take a look at http://www.bucketexplorer.com/documentation/amazon-s3--how-to-set-cache-control-header-for-s3-object.html

To set Cache-Control on files already uploaded to S3, use Update Metadata:

1) Run Bucket Explorer and login with your credentials.

2) After listing all Buckets, select any S3 Bucket.

3) It will list all objects of the selected S3 Bucket.

4) Select the files, right-click on them, and choose the “Update Metadata” option.

5) Add a Key and Value in the metadata attributes. Enter Key: “Cache-Control” with Value: “max-age=N”, where N is the number of seconds for which you want the object served from cache.

6) Click on Save button. It will update metadata as Cache-Control on all selected S3 objects.

Example to set max-age: for a time limit of 15 days, 3600 * 24 * 15 = 1296000 seconds, so set Key = “Cache-Control”, Value = “max-age=1296000”.

Note: If the object is an HTML file, append must-revalidate after the max-age value, i.e. Key: “Cache-Control”, Value: “max-age=2592000, must-revalidate” for 30 days. The must-revalidate directive goes after the time in seconds.
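As a quick sanity check on the arithmetic and the exact header syntax in the steps above (a sketch; the values mirror the 15-day and 30-day examples):

```ruby
# Cache-Control values from the examples above: seconds per day times days
fifteen_days = 3600 * 24 * 15          # 1296000 seconds
thirty_days  = 3600 * 24 * 30          # 2592000 seconds

plain_value  = "max-age=#{fifteen_days}"
html_value   = "max-age=#{thirty_days}, must-revalidate"
```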

answered Sep 22 '22 by MZaragoza


Assuming you have properly set the cache control headers and you are using signed URLs, then you will need to hold onto the signed URL for a given file and re-render the exact same URL on subsequent page loads.
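One way to hold onto and re-render the exact same URL is to cache each signed URL for a bit less than its expiry, so the browser sees a stable cache key across page loads. This is a minimal in-memory sketch; `sign_url` here is a hypothetical stand-in for whatever signing call you actually use (CarrierWave's `url`, or the AWS SDK):

```ruby
# Minimal sketch: reuse one signed URL per file until shortly before it
# expires, so repeated page renders emit the same URL and the browser
# cache can do its job.
class SignedUrlCache
  Entry = Struct.new(:url, :expires_at)

  def initialize(ttl: 3600)
    @ttl = ttl
    @entries = {}
  end

  def url_for(key)
    entry = @entries[key]
    return entry.url if entry && entry.expires_at > Time.now
    url = sign_url(key, expires_in: @ttl)
    # reuse until 90% of the TTL has elapsed, leaving a safety margin
    @entries[key] = Entry.new(url, Time.now + @ttl * 0.9)
    url
  end

  private

  # Hypothetical placeholder for your real signing call
  def sign_url(key, expires_in:)
    expires = (Time.now + expires_in).to_i
    "https://example-bucket.s3.amazonaws.com/#{key}?Expires=#{expires}&Signature=stub"
  end
end
```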

If you have not set the cache control headers, or you want them to vary based on who is making the request, you can set them when signing your URL with: &response-cache-control=value or &response-expires=value.
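A sketch of what that override looks like on the wire. The base signed URL below is a made-up placeholder, and note that S3 only honors the response-* parameters when they were included in the string that was signed:

```ruby
require 'cgi'

# Hypothetical signed URL (key id and signature are placeholders)
signed_url = "https://mybucket.s3.amazonaws.com/song1.mp3" \
             "?AWSAccessKeyId=AKIAEXAMPLE&Expires=1700000000&Signature=abc%3D"

# Ask S3 to serve the object with a Cache-Control response header;
# the directive value must be percent-encoded in the query string
override = "response-cache-control=#{CGI.escape('max-age=1296000')}"
url = "#{signed_url}&#{override}"
```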

answered Sep 23 '22 by Larry McKenzie