S3 bucket global uniqueness

I have been trying to understand why an S3 bucket name has to be globally unique. I came across a Stack Overflow answer saying that the bucket name has to be unique so that the Host header can be resolved. However, couldn't AWS direct s3-region.amazonaws.com to a region-specific web server that serves the bucket's objects from that region? That way the name would only need to be unique within a region, meaning a bucket with the same name could be created in a different region. Please let me know if my understanding of how name resolution works is wrong.

Rahul asked Jan 08 '20 03:01


People also ask

Are S3 buckets globally unique?

Amazon S3 supports global buckets, which means that each bucket name must be unique across all AWS accounts in all the AWS Regions within a partition. A partition is a grouping of Regions.

Why is S3 globally unique?

This is because every object in a bucket can be accessed directly through an endpoint that contains the S3 bucket name. That is why names must be globally unique. However, the data is neither replicated to nor stored in other regions.

Why do S3 buckets need to be unique?

Amazon S3 has a global namespace. (i.e. No two S3 buckets can have the same name.) It's similar to how DNS works where each domain name must be unique. Therefore, you need to use a unique bucket name when creating S3 buckets.

Is S3 bucket global or regional?

While the name space for buckets is global, S3 (like most of the other AWS services) runs in each AWS region (see the AWS Global Infrastructure page for more information).


2 Answers

There is not, strictly speaking, a technical reason why the bucket namespace absolutely had to be global. In fact, it technically isn't quite as global as most people might assume, because S3 has three distinct partitions that are completely isolated from each other and do not share the same global bucket namespace across partition boundaries -- the partitions are aws (the global collection of regions most people know as "AWS"), aws-us-gov (US GovCloud), and aws-cn (the Beijing and Ningxia isolated regions).
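To make the scope of "global" concrete, here is a minimal sketch of how a bucket name's uniqueness could be keyed by partition rather than by region. The function names and the prefix-based region check are assumptions for illustration only, covering just the three partitions named above; this is not an AWS API.

```python
def partition_for_region(region: str) -> str:
    """Map a region name to its partition (illustrative prefix check only)."""
    if region.startswith("cn-"):
        return "aws-cn"
    if region.startswith("us-gov-"):
        return "aws-us-gov"
    return "aws"


def namespace_key(region: str, bucket: str) -> tuple:
    """Hypothetical uniqueness key: the partition plus the bucket name.

    The region itself is deliberately NOT part of the key.
    """
    return (partition_for_region(region), bucket)


# Same partition, different regions: the names collide.
print(namespace_key("us-east-1", "logs") == namespace_key("eu-west-1", "logs"))
# Different partitions: the names do not collide.
print(namespace_key("us-east-1", "logs") == namespace_key("cn-north-1", "logs"))
```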

So things could have been designed differently, with each region independent, but that is irrelevant now, because the global namespace is entrenched.

But why?

The specific reasons for the global namespace aren't publicly stated, but almost certainly have to do with the evolution of the service, backwards compatibility, and ease of adoption of new regions.

S3 is one of the oldest of the AWS services, older than even EC2. They almost certainly did not foresee how large it would become.

Originally, the namespace was global of necessity because there weren't multiple regions. S3 had a single logical region (called "US Standard" for a long time) that was in fact made up of at least two physical regions, in or near us-east-1 and us-west-2. You didn't know or care which physical region each upload went to, because they replicated back and forth, transparently, and latency-based DNS resolution automatically gave you the endpoint with the lowest latency. Many users never knew this detail.

You could even explicitly override the automatic geo-routing of DNS and upload to the east using the s3-external-1.amazonaws.com endpoint or to the west using the s3-external-2.amazonaws.com endpoint, but your object would shortly be accessible from either endpoint.

Up until this point, S3 did not offer immediate read-after-write consistency on new objects since that would be impractical in the primary/primary, circular replication environment that existed in earlier days.

Eventually, S3 launched in other AWS regions as they came online, but they designed it so that a bucket in any region could be accessed as ${bucket}.s3.amazonaws.com. This used DNS to route the request to the correct region, based on the bucket name in the hostname, and S3 maintained the DNS mappings. *.s3.amazonaws.com was (and still is) a wildcard record that pointed everything to "S3 US Standard" but S3 would create a CNAME for your bucket that overrode the wildcard and pointed to the correct region, automatically, a few minutes after bucket creation. Until then, S3 would return a temporary HTTP redirect. This, obviously enough, requires a global bucket namespace. It still works for all but the newest regions.
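The wildcard-plus-override behaviour described above can be sketched as a toy lookup. This is only a model of the idea, not a real DNS resolver: the target names (`us-standard.invalid`, `eu-west-1.invalid`) and the `resolve` function are hypothetical.

```python
S3_SUFFIX = ".s3.amazonaws.com"
WILDCARD_TARGET = "us-standard.invalid"  # hypothetical stand-in for "S3 US Standard"


def resolve(hostname: str, bucket_cnames: dict) -> str:
    """Return the target for a hostname: a specific record beats the wildcard."""
    if hostname in bucket_cnames:
        # A per-bucket CNAME, created shortly after the bucket, wins.
        return bucket_cnames[hostname]
    if hostname.endswith(S3_SUFFIX):
        # Everything else under *.s3.amazonaws.com falls through to the wildcard.
        return WILDCARD_TARGET
    raise LookupError(f"no record for {hostname}")


# Hypothetical state after a bucket was created in a non-default region.
cnames = {"my-eu-bucket.s3.amazonaws.com": "eu-west-1.invalid"}

print(resolve("my-eu-bucket.s3.amazonaws.com", cnames))
print(resolve("brand-new-bucket.s3.amazonaws.com", cnames))
```

Because the lookup key is the bucket name alone, two regions could not both own `my-eu-bucket` under this scheme, which is the point the paragraph above makes.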

But why did they do it that way? After all, at around the same time S3 also introduced endpoints in the style ${bucket}.s3-${region}.amazonaws.com ¹ that are actually wildcard DNS records: *.s3-${region}.amazonaws.com routes directly to the regional S3 endpoint for each S3 region, and is a responsive (but unusable) endpoint, even for nonexistent buckets. If you create a bucket in us-east-2 and send a request for that bucket to the eu-west-1 endpoint, S3 in eu-west-1 will throw an error, telling you that you need to send the request to us-east-2.
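For reference, the hostname styles discussed so far can be written out as simple string templates. The helper names are made up for this sketch; the formats themselves are the ones given in the text, including the later dot-separated regional form mentioned in footnote ¹.

```python
def global_endpoint(bucket: str) -> str:
    """Region-agnostic style: ${bucket}.s3.amazonaws.com"""
    return f"{bucket}.s3.amazonaws.com"


def regional_endpoint(bucket: str, region: str, legacy_dash: bool = False) -> str:
    """Regional style, with either the older '-' or the newer '.' after 's3'."""
    separator = "-" if legacy_dash else "."
    return f"{bucket}.s3{separator}{region}.amazonaws.com"


print(global_endpoint("my-bucket"))
print(regional_endpoint("my-bucket", "us-east-2", legacy_dash=True))
print(regional_endpoint("my-bucket", "us-east-2"))
```

Only the first form carries no region, so only it depends on S3 knowing, from the bucket name alone, where the bucket lives.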

Also, around this time, they quietly dropped the whole east/west replication thing, and later renamed US Standard to what it really was at that point -- us-east-1. (Buttressing the "backwards compatibility" argument, s3-external-1 and s3-external-2 are still valid endpoints, but they both point to precisely the same place, in us-east-1.)

So why did the bucket namespace remain global? The only truly correct answer an outsider can give is "because that's what they decided to do."

But perhaps one factor was that AWS wanted to preserve compatibility with existing software that used ${bucket}.s3.amazonaws.com so that customers could deploy buckets in other regions without code changes. In the old days of Signature Version 2 (and earlier), the code that signed requests did not need to know the API endpoint region. Signature Version 4 requires knowledge of the endpoint region in order to generate a valid signature, because the signing key is derived from the date, region, and service... but previously it wasn't like that, so you could just drop in a bucket name and client code needed no regional awareness -- or even awareness that S3 had regions at all -- in order to work with a bucket in any region.
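The region dependence in Signature Version 4 is visible in the signing-key derivation itself, which is a chain of HMAC-SHA256 operations over the date, region, and service. Below is a minimal sketch of that derivation step only (not a full request signer); the secret key and date are placeholder values.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def sigv4_signing_key(secret_key: str, date: str, region: str, service: str = "s3") -> bytes:
    """Derive the SigV4 signing key; `date` is in YYYYMMDD form."""
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date)
    k_region = _hmac_sha256(k_date, region)      # the region enters here
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


# Placeholder credentials, for illustration only.
key_east = sigv4_signing_key("EXAMPLE-SECRET", "20200108", "us-east-1")
key_west = sigv4_signing_key("EXAMPLE-SECRET", "20200108", "us-west-2")
print(key_east != key_west)  # a signer that guesses the wrong region derives the wrong key
```

A client therefore has to know the bucket's region before it can sign, which is exactly the awareness that Signature Version 2 clients never needed.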

AWS is well-known for its practice of preserving backwards compatibility. They do this so consistently that occasionally some embarrassing design errors creep in and remain unfixed because to fix them would break running code.²

Another issue is virtual hosting of buckets. Back before HTTPS was accepted as non-optional, it was common to host static content by pointing your CNAME to the S3 endpoint. If you pointed www.example.com to S3, it would serve the content from a bucket with the exact name www.example.com. You can still do this, but it isn't useful any more since it doesn't support HTTPS. To host static S3 content with HTTPS, you use CloudFront in front of the bucket. Since CloudFront rewrites the Host header, the bucket name can be anything.

You might be asking why you couldn't just point the www.example.com CNAME to the endpoint hostname of your bucket, but HTTP and DNS operate at very different layers and it simply doesn't work that way. (If you doubt this assertion, try pointing a CNAME from a domain that you control to www.google.com. You will not find that your domain serves the Google home page; instead, you'll be greeted with an error because the Google server will only see that it has received a request for www.example.com, and will be oblivious to the fact that there was an intermediate CNAME pointing to it.) Virtual hosting of buckets requires either a global bucket namespace (so the Host header exactly matches the bucket) or an entirely separate mapping database of hostnames to bucket names... and why do that when you already have an established global namespace of buckets?
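The Host-header matching described above might look roughly like the following on the server side. This is a guess at the logic for illustration, not S3's actual implementation, and `bucket_from_host` is a hypothetical name.

```python
S3_SUFFIX = ".s3.amazonaws.com"


def bucket_from_host(host_header: str) -> str:
    """Pick the bucket name from the HTTP Host header alone.

    The server never sees any CNAME the client followed; it only sees
    the hostname the client originally asked for.
    """
    host = host_header.split(":")[0].lower().rstrip(".")  # drop port and trailing dot
    if host.endswith(S3_SUFFIX):
        return host[: -len(S3_SUFFIX)]  # ${bucket}.s3.amazonaws.com
    return host  # custom domain: the bucket must have exactly this name


print(bucket_from_host("my-bucket.s3.amazonaws.com"))
print(bucket_from_host("www.example.com"))
```

With no region or account information in the request, the hostname is the only key available, so it has to identify one bucket unambiguously.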


¹ Note that the - after s3 in these endpoints was eventually replaced by a much more logical . but these old endpoints still work.

² Two examples that come to mind: (1) S3's incorrect omission of the Vary: Origin response header when a non-CORS request arrives at a CORS-enabled bucket (I have argued, to no avail, that this can be fixed without breaking anything); (2) S3's blatantly incorrect handling of the symbol + in an object key, on the API, where the service interprets + as meaning %20 (space), so if you want a browser to download from a link to /foo+bar you have to upload it as /foo{space}bar.
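The + ambiguity in example (2) comes from two different decoding conventions, which Python's standard library happens to expose side by side. This sketch does not reproduce S3's behaviour; it only shows the two interpretations of the same path, with hypothetical wrapper names.

```python
from urllib.parse import quote, unquote, unquote_plus


def decode_as_form(path: str) -> str:
    """Form-style decoding: '+' is treated as a space."""
    return unquote_plus(path)


def decode_as_path(path: str) -> str:
    """Plain percent-decoding: '+' is left as a literal plus sign."""
    return unquote(path)


def percent_encode_key(key: str) -> str:
    """Percent-encode a key so that a literal '+' becomes %2B."""
    return quote(key)


print(decode_as_form("/foo+bar"))
print(decode_as_path("/foo+bar"))
print(percent_encode_key("/foo+bar"))
```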

Michael - sqlbot answered Oct 22 '22 05:10


You create an S3 bucket in a specific region, and objects stored in that bucket are stored only in that region. The data is neither replicated to nor stored in other regions unless you set up replication on a per-bucket basis.

However, S3 shares a single global namespace across all accounts, so the name given to an S3 bucket must be unique.

This requirement is designed to support globally unique DNS names for each bucket, e.g. http://bucketname.s3.amazonaws.com

Rodrigo Murillo answered Oct 22 '22 05:10