 

Has anyone ever reached a read or write upper-bound for an Amazon S3 bucket?

Are there known limitations to S3's scaling? Has anyone ever had so many simultaneous reads or writes that a bucket started returning errors? I'm more interested in writes than reads, because S3 is likely to be optimized for reads.

asked Jan 30 '12 by Paul Prescod

1 Answer

Eric's comment already sums this up on a conceptual level; it is also addressed in the FAQ What happens if traffic from my application suddenly spikes?:

Amazon S3 was designed from the ground up to handle traffic for any Internet application. [...] Amazon S3’s massive scale enables us to spread load evenly, so that no individual application is affected by traffic spikes.

Of course, you still need to account for possible issues and Tune [your] Application for Repeated SlowDown errors (see Amazon S3 Error Best Practices):

As with any distributed system, S3 has protection mechanisms which detect intentional or unintentional resource over-consumption and react accordingly. SlowDown errors can occur when a high request rate triggers one of these mechanisms. Reducing your request rate will decrease or eliminate errors of this type. Generally speaking, most users will not experience these errors regularly; however, if you would like more information or are experiencing high or unexpected SlowDown errors, please post to our Amazon S3 developer forum http://developer.amazonwebservices.com/connect/forum.jspa?forumID=24 or sign up for AWS Premium Support http://aws.amazon.com/premiumsupport/. [emphasis mine]

While rare, these slowdowns do happen of course. Here is an AWS team response illustrating the issue (pretty dated by now, though):

Amazon S3 will return this error when the request rate is high enough that servicing the requests would cause degraded service for other customers. This error is very rarely triggered. If you do receive it, you should exponentially back off. If this error occurs, system resources will be reactively rebalanced/allocated to better support a higher request rate. As a result, the time period during which this error would be thrown should be relatively short. [emphasis mine]
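In practice, "exponentially back off" just means retrying the failed request with an increasing delay between attempts. Here is a minimal sketch of that pattern using boto3; the bucket, key, and retry count are placeholders, not anything prescribed by AWS (and note that recent SDKs can do similar retries for you out of the box):

```python
# Minimal sketch: retry S3 PUTs with exponential backoff on "SlowDown" errors.
# Assumes boto3; bucket/key names and max_retries are illustrative only.
import time
import boto3
from botocore.exceptions import ClientError

s3 = boto3.client("s3")

def put_with_backoff(bucket, key, body, max_retries=5):
    """PUT an object, backing off exponentially when S3 asks us to slow down."""
    for attempt in range(max_retries):
        try:
            return s3.put_object(Bucket=bucket, Key=key, Body=body)
        except ClientError as e:
            if e.response["Error"]["Code"] == "SlowDown":
                # Back off exponentially: 1s, 2s, 4s, 8s, ...
                time.sleep(2 ** attempt)
            else:
                raise
    raise RuntimeError("still throttled after %d retries" % max_retries)
```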

Your assumption about read vs. write optimization is confirmed there as well:

The threshold where this error is triggered varies and will depend, in part, on the request type and pattern. In general, you'll be able to achieve higher rps with gets vs. puts and with lots of gets for a small number of keys vs. lots of gets for a large number of keys. When getting or putting a large number of keys you'll be able to achieve higher rps if the keys are in alphanumeric order vs. random/hashed order.
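To illustrate the key-ordering point from that (dated) guidance, here is a hypothetical sketch of a bulk upload that uses zero-padded, alphanumerically ordered keys instead of random/hashed names; the bucket name, prefix, and payloads are placeholders of my own:

```python
# Illustrative only: upload a batch under lexicographically ordered keys
# (data/000000.bin, data/000001.bin, ...) rather than random/hashed names.
import boto3

s3 = boto3.client("s3")

def upload_batch(bucket, payloads):
    for i, payload in enumerate(payloads):
        # Zero-padding keeps lexicographic order identical to numeric order.
        key = "data/%06d.bin" % i
        s3.put_object(Bucket=bucket, Key=key, Body=payload)
```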

answered Oct 01 '22 by Steffen Opel