How to specify that HTTP Status Code 304 (NotModified) is not an error condition inside the Amazon S3 GetObject API?

Background

I am trying to use S3 as an 'infinitely' large caching layer for some 'fairly' static XML documents. I want to ensure that the client application (which will run on thousands of machines concurrently and request the XML documents many times per hour) only downloads a document if its content has changed since the client last downloaded it.

Approach

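On Amazon S3, we can use the HTTP ETag for this. By default, Amazon S3 sets an object's ETag to the MD5 hash of the object's content (this holds for objects uploaded in a single part; multipart uploads get a different ETag format).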
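As a minimal sketch of how the client could derive the value to compare against, assuming the object was uploaded in a single part so that its ETag really is the plain MD5 digest, and assuming the ETag may arrive wrapped in double quotes. The ETagHelper class and its method names are mine, not part of the AWS SDK:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper; not part of the AWS SDK.
public static class ETagHelper
{
    // Returns the lowercase hex MD5 digest of the given bytes.
    public static string ComputeMd5Hex(byte[] content)
    {
        using (var md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(content);
            var sb = new StringBuilder(hash.Length * 2);
            foreach (byte b in hash)
            {
                sb.Append(b.ToString("x2"));
            }
            return sb.ToString();
        }
    }

    // Strips any surrounding double quotes from the ETag, then compares it
    // (case-insensitively) with the MD5 digest of the local content.
    public static bool MatchesETag(byte[] content, string etag)
    {
        string normalized = etag.Trim('"');
        return string.Equals(ComputeMd5Hex(content), normalized, StringComparison.OrdinalIgnoreCase);
    }
}
```

The client would compute this once per cached document and keep it next to the cached copy, so it can be sent with the next request.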

We can then put the MD5 hash of our local copy of the XML document in the GetObjectRequest.ETagToNotMatch property. When we call AmazonS3.GetObject (or, in my case, the async pair AmazonS3.BeginGetObject and AmazonS3.EndGetObject), S3 compares that value with the ETag of the requested document. If they match, S3 returns HTTP status code 304 (NotModified) and the contents of the XML document are not downloaded.
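For reference, this is roughly what the same conditional GET looks like at the HTTP level, without the SDK. It is only a sketch: it assumes the object is readable without request signing (for example, a public object), and the ConditionalGet class and its method names are mine. As far as I know, HttpWebRequest normally surfaces a 304 as a WebException, so the sketch handles both that path and a plain 304 response:

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper; shows the If-None-Match exchange without the AWS SDK.
public static class ConditionalGet
{
    // ETags travel in HTTP headers wrapped in double quotes.
    public static string QuoteETag(string etag)
    {
        return "\"" + etag.Trim('"') + "\"";
    }

    // Returns the body, or null when the server answers 304 Not Modified.
    public static string DownloadIfChanged(string url, string etag)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "GET";
        if (etag != null)
        {
            request.Headers[HttpRequestHeader.IfNoneMatch] = QuoteETag(etag);
        }

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode == HttpStatusCode.NotModified)
                {
                    return null;
                }
                using (var reader = new StreamReader(response.GetResponseStream()))
                {
                    return reader.ReadToEnd();
                }
            }
        }
        catch (WebException ex)
        {
            // A 304 reported as an exception is the "not changed" signal, not a failure.
            var errorResponse = ex.Response as HttpWebResponse;
            if (errorResponse != null && errorResponse.StatusCode == HttpStatusCode.NotModified)
            {
                errorResponse.Close();
                return null;
            }
            throw;
        }
    }
}
```

A null return means the cached copy is still current; anything else is the new document body.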

Problem

The problem, however, is that when calling AmazonS3.GetObject (or its async equivalent), the Amazon .NET API treats the HTTP status code 304 (NotModified) as an error: it retries the GET request three times and then finally throws an Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3.

Obviously I could change this implementation to call AmazonS3.GetObjectMetadata first, compare the ETag, and only call AmazonS3.GetObject if they do not match, but then there are two requests to S3 instead of one whenever the cached file is stale. I'd prefer to make one request regardless of whether the XML document needs to be downloaded or not.
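To make the cost of that fallback concrete, here is a rough sketch of the two-step check. IObjectStore is a hypothetical stand-in for the two S3 calls (it is not an SDK type), and CountingStore is an in-memory fake that only counts requests:

```csharp
using System;

// Hypothetical abstraction over the two S3 calls; not part of the AWS SDK.
public interface IObjectStore
{
    string GetETag(string key);     // stands in for the metadata request
    string GetContent(string key);  // stands in for the full GetObject request
}

public static class TwoStepDownloader
{
    // Returns the fresh content, or null if cachedETag is still current.
    public static string DownloadIfStale(IObjectStore store, string key, string cachedETag)
    {
        string currentETag = store.GetETag(key);  // request 1, always made
        if (string.Equals(currentETag, cachedETag, StringComparison.Ordinal))
        {
            return null;                          // unchanged: one request in total
        }
        return store.GetContent(key);             // request 2, only when stale
    }
}

// In-memory fake used to count how many requests each path makes.
public sealed class CountingStore : IObjectStore
{
    private readonly string _etag;
    private readonly string _content;

    public int Requests { get; private set; }

    public CountingStore(string etag, string content)
    {
        _etag = etag;
        _content = content;
    }

    public string GetETag(string key) { Requests++; return _etag; }
    public string GetContent(string key) { Requests++; return _content; }
}
```

With an up-to-date cache this makes one request; with a stale cache it makes two, which is the overhead I would like to avoid.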

Any ideas? Is this a bug or am I missing something? Is there even some way I can reduce the number of retries to one and 'process' the exception (although I feel 'yuck' about this route)?

Implementation

I'm using the AWS SDK for .NET (version 1.3.14).

Here is my implementation (reduced slightly to keep it shorter):

public Task<GetObjectResponse> DownloadString(string key, string etag = null) {

    var request = new GetObjectRequest { Key = key, BucketName = Bucket };

    // Only ask for the content if its ETag differs from the one we already have.
    if (etag != null) {
        request.ETagToNotMatch = etag;
    }

    // Wrap the SDK's Begin/End pair in a Task.
    return Task<GetObjectResponse>.Factory.FromAsync(_s3Client.BeginGetObject, _s3Client.EndGetObject, request, null);
}

I then call this like:

var dlTask          = s3Manager.DownloadString("new one", "d7db7bc318d6eb9222d728747879b52e");
var responseTasks   = new[]
    {
        dlTask.ContinueWith(x => _log.Error("Error downloading string.", x.Exception), TaskContinuationOptions.OnlyOnFaulted),
        dlTask.ContinueWith(x => _log.Warn("Downloading string was cancelled."), TaskContinuationOptions.OnlyOnCanceled),
        dlTask.ContinueWith(x => _log.Info(string.Format("Done with download: {0}", x.Result.ETag)), TaskContinuationOptions.OnlyOnRanToCompletion)
    };

try {
    Task.WaitAny(responseTasks);
} catch (AggregateException aex) {
    _log.Error("Error while processing download string.", aex);
}

_log.Info("Exiting...");

This then produces this log file output:

2011-10-11 13:21:20,376 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.6140812.
2011-10-11 13:21:20,385 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:20,789 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 1 for request GetObject.
2011-10-11 13:21:22,329 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.1400356.
2011-10-11 13:21:22,329 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:23,929 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 2 for request GetObject.
2011-10-11 13:21:26,508 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:00.9790314.
2011-10-11 13:21:26,508 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:32,908 [11] INFO  Amazon.S3.AmazonS3Client - Retry number 3 for request GetObject.
2011-10-11 13:21:40,604 [11] INFO  Amazon.S3.AmazonS3Client - Received response for GetObject (id 2ee99002-d148-4572-b19b-29259534f48f) with status code NotModified in 00:00:01.2950718.
2011-10-11 13:21:40,605 [11] INFO  Amazon.S3.AmazonS3Client - Request for GetObject is being redirect to https://s3.amazonaws.com/x/new%20one.
2011-10-11 13:21:40,621 [11] ERROR Amazon.S3.AmazonS3Client - Error for GetResponse
Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
2011-10-11 13:21:40,635 [10] INFO  Example.Program - Exiting...
2011-10-11 13:21:40,638 [19] ERROR Example.Program - Error downloading string.
System.AggregateException: One or more errors occurred. ---> Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)
   --- End of inner exception stack trace ---
---> (Inner Exception #0) Amazon.S3.AmazonS3Exception: Maximum number of retry attempts reached : 3
   at Amazon.S3.AmazonS3Client.pauseOnRetry(Int32 retries, Int32 maxRetries, HttpStatusCode status, String requestAddr, WebHeaderCollection headers, Exception cause)
   at Amazon.S3.AmazonS3Client.handleHttpResponse[T](S3Request userRequest, HttpWebRequest request, HttpWebResponse httpResponse, Int32 retries, TimeSpan lengthOfRequest, T& response, Exception& cause, HttpStatusCode& statusCode)
   at Amazon.S3.AmazonS3Client.getResponseCallback[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.endOperation[T](IAsyncResult result)
   at Amazon.S3.AmazonS3Client.EndGetObject(IAsyncResult asyncResult)
   at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endMethod, TaskCompletionSource`1 tcs)<---

asked Oct 11 '11 13:10

InvertedAcceleration

1 Answer

I also posted this question on the Amazon developer forum and got a reply from an official AWS employee:

After investigating this we understand the problem but we are looking for feedback on how best to handle this.

First approach is to have this operation return with a property on the GetObjectResponse indicating that the object was not returned or set the output stream to null. This would be cleaner to code against but it does create a slight breaking behavior for anybody relying on an exception being thrown, albeit after the 3 retries. It would also be inconsistent with the CopyObject operation which does throw an exception without all the crazy retrying.

The other option is we throw an exception similar to CopyObject which keeps us consistent and no breaking changes but it is more difficult to code against.

If anybody has opinions on which way to handle this please respond to this thread.

Norm
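To make the two options concrete, here is a rough sketch of what calling code might look like under each. Every type and name below (FetchResult, NotModifiedException, CacheClient) is hypothetical and only for illustration; neither is the SDK's actual API:

```csharp
using System;

// Hypothetical response shape for the first option.
public sealed class FetchResult
{
    public bool NotModified { get; private set; }
    public string Body { get; private set; }

    public FetchResult(bool notModified, string body)
    {
        NotModified = notModified;
        Body = body;
    }
}

// Hypothetical exception type for the second option.
public sealed class NotModifiedException : Exception
{
    public NotModifiedException() : base("304 Not Modified") { }
}

public static class CacheClient
{
    // Option 1: the caller inspects a property on the response.
    public static string RefreshWithFlag(Func<FetchResult> fetch, string cached)
    {
        FetchResult result = fetch();
        return result.NotModified ? cached : result.Body;
    }

    // Option 2: the caller treats an exception as the "unchanged" signal.
    public static string RefreshWithException(Func<string> fetch, string cached)
    {
        try
        {
            return fetch();
        }
        catch (NotModifiedException)
        {
            return cached;
        }
    }
}
```

In the first shape the unchanged case is ordinary control flow; in the second, every cache hit goes through a try/catch.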

I have already added my thoughts to the thread. If anybody else is interested in participating, here is the link:

AmazonS3.GetObject sees HTTP 304 (NotModified) as an error. Way to allow it?

NOTE: When this has been resolved by Amazon I will update my answer to reflect the outcome.

UPDATE: (2012-01-24) Still waiting for further information from Amazon.

UPDATE: (2018-12-06) This was fixed in AWS SDK 1.5.20 in 2013: https://forums.aws.amazon.com/thread.jspa?threadID=77995&tstart=0

answered Nov 15 '22 07:11

InvertedAcceleration


Tags: c#, .net, amazon-web-services, amazon-s3, task-parallel-library
