AWS S3: Recommend Nested Bucket Architecture

Question

I am having challenges architecting my S3 bucket structure.

My Application

I have several types (roles) of users and each user has different types of PDF documents that will be uploaded to S3. The user will see each document in their dashboard and should be able to view the PDF from the application (ideally by opening in a new tab instead of downloading it). Below is an example:

User Roles

role_a
role_b

User Documents (for role_a)

document_type_a (filename: 0888a5ce)
document_type_b (filename: c00630fr)
document_type_c (filename: 2349d1c)

User Documents (for role_b)

document_type_x (filename: fe294090)
document_type_y (filename: cad2d3dc)

Each user can have zero or more documents.

My questions:

What is the best way to design a nested S3 bucket structure?
The filename will be saved in the database for each user. In addition to this, what other components of the S3 bucket structure should be saved in the database and what components should be derived from the application to optimize uploading and downloading of these PDF documents?
In the above nested structure, what would be the bucket name and what would be the key of the document?

Kevin Brown · Accepted Answer

There is really no such thing as a nested structure in S3. All files in S3 are stored as a bucket and a key, and the key contains the part you think of as directories. You can confirm this by storing one document in a bucket at, say, /foo/bar/doc.pdf, then deleting that file and looking at the structure in S3: foo and bar will be gone.
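The point about folders disappearing can be illustrated with a small in-memory model. This is only a sketch: the `bucket` dict and the `list_folders` helper are hypothetical stand-ins and do not talk to S3, and the key is written without the leading slash. It shows that "folders" are derived from the keys that exist and are not stored separately.

```python
# In-memory model of a bucket: a flat mapping from full key to object bytes.
# Illustration only; nothing here calls S3.
bucket = {}

def list_folders(bucket, delimiter="/"):
    """Derive the 'folders' implied by the keys currently in the bucket."""
    folders = set()
    for key in bucket:
        parts = key.split(delimiter)[:-1]  # drop the final file name
        for i in range(1, len(parts) + 1):
            folders.add(delimiter.join(parts[:i]) + delimiter)
    return sorted(folders)

bucket["foo/bar/doc.pdf"] = b"placeholder bytes"
folders_before = list_folders(bucket)   # ['foo/', 'foo/bar/']

del bucket["foo/bar/doc.pdf"]
folders_after = list_folders(bucket)    # []
```

Once the only object under foo/bar/ is deleted, no key implies those prefixes any more, so nothing is left to display.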

So you could do things in many ways, one would be:

Bucket: mybucket

Key: /role_a/document_type_a/0888a5ce.pdf
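As a minimal sketch of how the application could derive such a key, assuming only the filename is stored in the database and the role and document type are already known to the application: the `build_key` helper below is hypothetical, and it omits the leading slash, since S3 would treat that slash as an ordinary character in the key.

```python
def build_key(role, document_type, filename, extension="pdf"):
    """Join the path-like parts into a single S3 object key."""
    parts = [role, document_type, f"{filename}.{extension}"]
    for part in parts:
        # Reject empty parts and embedded slashes so the layout stays predictable.
        if not part or "/" in part:
            raise ValueError(f"invalid key component: {part!r}")
    return "/".join(parts)

key = build_key("role_a", "document_type_a", "0888a5ce")
# key == "role_a/document_type_a/0888a5ce.pdf"
```

With this approach the bucket name can live in application configuration, and the database only needs the filename plus whatever identifies the role and document type.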

John Rotenstein · Answer

The simplest structure would be a totally flat storage structure:

Generate a Unique ID for each object (e.g. using a GUID function)
Save the object in S3 with a Key equal to the Unique ID
Store the Unique ID in a database that maps the object to your user together with metadata such as original filename, dates, permissions, etc.
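The three steps above can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: `store_document` is a hypothetical helper, `db` is a plain dict standing in for a real database, and `s3_client` is expected to be a boto3 S3 client (or anything exposing the same `put_object` method).

```python
import uuid

def store_document(s3_client, bucket, db, user_id, original_filename, body):
    """Upload under a generated ID and record the mapping in the database."""
    object_id = str(uuid.uuid4())          # step 1: generate a unique ID
    s3_client.put_object(                  # step 2: the key is the unique ID
        Bucket=bucket,
        Key=object_id,
        Body=body,
        ContentType="application/pdf",
    )
    # step 3: map the object to the user, with metadata (dict used as a stand-in)
    db.setdefault(user_id, []).append(
        {"object_id": object_id, "original_filename": original_filename}
    )
    return object_id
```

With boto3 installed and credentials configured, the client would typically be created with `boto3.client("s3")` and passed in as `s3_client`.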

You could choose to prefix each object with a user identifier, which is useful for debugging or trying to reconstruct content in case of a database failure, but there is no particular performance benefit if you are correctly referencing the database for a list of user files.
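A minimal sketch of the user-prefix variant, assuming keys of the form `<user_id>/<object_id>`. Both helpers are hypothetical, and the key list is an in-memory stand-in for a bucket listing. It shows how a user's objects could be recovered from key names alone if the database were lost.

```python
import uuid

def prefixed_key(user_id, object_id=None):
    """Return '<user_id>/<object_id>', generating the ID when none is given."""
    object_id = object_id or str(uuid.uuid4())
    return f"{user_id}/{object_id}"

def keys_for_user(all_keys, user_id):
    """Recover a user's objects from key names alone."""
    prefix = f"{user_id}/"   # trailing slash keeps 'user-1' from matching 'user-10'
    return [k for k in all_keys if k.startswith(prefix)]

keys = [
    prefixed_key("user-1", "a1"),
    prefixed_key("user-2", "b2"),
    prefixed_key("user-1", "c3"),
]
recovered = keys_for_user(keys, "user-1")   # ['user-1/a1', 'user-1/c3']
```

Against a real bucket, the same filtering can be done server-side by passing a `Prefix` argument to boto3's `list_objects_v2`.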

Tags:

pdf

amazon-web-services

amazon-s3

document

bucket

alphathesis
