AWS S3: Recommend Nested Bucket Architecture

I am having challenges architecting my S3 bucket structure.

My Application

I have several types (roles) of users and each user has different types of PDF documents that will be uploaded to S3. The user will see each document in their dashboard and should be able to view the PDF from the application (ideally by opening in a new tab instead of downloading it). Below is an example:

User Roles

  1. role_a
  2. role_b

User Documents (for role_a)

  1. document_type_a (filename: 0888a5ce)
  2. document_type_b (filename: c00630fr)
  3. document_type_c (filename: 2349d1c)

User Documents (for role_b)

  1. document_type_x (filename: fe294090)
  2. document_type_y (filename: cad2d3dc)

Each user can have zero or more documents.

My questions:

  1. What is the best way to design a nested S3 bucket structure?
  2. The filename will be saved in the database for each user. Beyond that, which components of the S3 bucket structure should be saved in the database, and which should be derived by the application, so that uploading and downloading these PDF documents is efficient?
  3. In the above nested structure, what would be the bucket name and what would be the key of the document?
alphathesis asked Dec 14 '25 05:12


2 Answers

S3 has no real nested structure. Every object is stored under a bucket and a key, and what looks like a directory path is simply part of the key. You can confirm this by storing a single document in a bucket under a key such as foo/bar/doc.pdf, then deleting that object and looking at the bucket in S3: foo and bar will be gone.
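To illustrate the point, here is a minimal sketch that models a bucket as a plain Python dict of key to content. It does not call S3; the `folders` helper is a hypothetical name that derives the "directories" a console would display purely from the keys that exist.

```python
def folders(keys, delimiter="/"):
    """Return the 'folder' prefixes implied by a collection of object keys."""
    result = set()
    for key in keys:
        parts = key.split(delimiter)[:-1]  # drop the final segment (the "file")
        for i in range(1, len(parts) + 1):
            result.add(delimiter.join(parts[:i]) + delimiter)
    return result


# A stand-in for a bucket: a flat mapping of key -> object body.
bucket = {"foo/bar/doc.pdf": b"placeholder bytes"}

before = sorted(folders(bucket))
print(before)  # ['foo/', 'foo/bar/']

# Deleting the only object removes the "folders" too, because they never
# existed as separate entries.
del bucket["foo/bar/doc.pdf"]

after = sorted(folders(bucket))
print(after)  # []
```

The folders are a view computed from the keys, which is why they disappear together with the last object under them.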

You could lay this out in many ways. One would be:

Bucket: mybucket

Key: role_a/document_type_a/0888a5ce.pdf
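A key of this shape can be derived in the application from values it already knows. This is a rough sketch only: `build_key` is a hypothetical helper, and the `.pdf` extension and the rule against slashes inside a segment are assumptions rather than anything S3 requires. The key is built without a leading slash, since S3 would treat a leading slash as part of the key itself.

```python
def build_key(role, document_type, filename, extension="pdf"):
    """Join role, document type and filename into a single S3 object key."""
    for segment in (role, document_type, filename):
        # Assumption: segments are simple names, so an embedded slash is a bug.
        if not segment or "/" in segment:
            raise ValueError(f"invalid key segment: {segment!r}")
    return "/".join([role, document_type, f"{filename}.{extension}"])


key = build_key("role_a", "document_type_a", "0888a5ce")
print(key)  # role_a/document_type_a/0888a5ce.pdf
```

With this approach only the filename needs to be stored per document, provided the role and document type are already recorded elsewhere in the database.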

Kevin Brown answered Dec 17 '25 00:12


The simplest structure would be a totally flat storage structure:

  • Generate a Unique ID for each object (e.g. using a GUID function)
  • Save the object in S3 with a Key equal to the Unique ID
  • Store the Unique ID in a database that maps the object to your user together with metadata such as original filename, dates, permissions, etc.
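The three steps above could be sketched roughly as follows. This uses an in-memory SQLite database as a stand-in for the application's real database, and the table name, column names, and the `register_document` and `list_documents` helpers are all assumptions made for illustration. The actual upload to S3 is left out; the returned ID is what would be used as the object key.

```python
import sqlite3
import uuid
from datetime import datetime, timezone


def register_document(db, user_id, document_type, original_filename):
    """Generate a unique object key and record it against the user."""
    s3_key = str(uuid.uuid4())  # flat key: no prefix, no "folders"
    db.execute(
        "INSERT INTO documents "
        "(s3_key, user_id, document_type, original_filename, uploaded_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (s3_key, user_id, document_type, original_filename,
         datetime.now(timezone.utc).isoformat()),
    )
    return s3_key


def list_documents(db, user_id):
    """Look up a user's documents in the database, not by listing the bucket."""
    rows = db.execute(
        "SELECT s3_key, document_type, original_filename "
        "FROM documents WHERE user_id = ? ORDER BY uploaded_at",
        (user_id,),
    )
    return rows.fetchall()


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE documents ("
    "s3_key TEXT PRIMARY KEY, user_id INTEGER NOT NULL, "
    "document_type TEXT NOT NULL, original_filename TEXT NOT NULL, "
    "uploaded_at TEXT NOT NULL)"
)

key = register_document(db, 42, "document_type_a", "contract.pdf")
print(list_documents(db, 42))
```

The dashboard would be driven entirely by `list_documents`, so the key itself never needs to carry any meaning.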

You could choose to prefix each object with a user identifier, which is useful for debugging or trying to reconstruct content in case of a database failure, but there is no particular performance benefit if you are correctly referencing the database for a list of user files.
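As a sketch of the prefixed variant, the key can be built from a user identifier plus the unique ID, and a user's objects can later be recovered from a key listing alone. The `user-<id>/` prefix format and both helper names are assumptions; the list of keys here stands in for whatever a bucket listing would return.

```python
import uuid


def prefixed_key(user_id, object_id=None):
    """Build a key of the form 'user-<id>/<unique id>'."""
    object_id = object_id or str(uuid.uuid4())
    return f"user-{user_id}/{object_id}"


def keys_for_user(all_keys, user_id):
    """Recover one user's keys from a listing, e.g. after losing the database."""
    prefix = f"user-{user_id}/"  # trailing slash keeps user 4 apart from user 42
    return [k for k in all_keys if k.startswith(prefix)]


listing = [
    prefixed_key(4, "aaa"),
    prefixed_key(42, "bbb"),
    prefixed_key(42, "ccc"),
]
print(keys_for_user(listing, 42))  # ['user-42/bbb', 'user-42/ccc']
```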

John Rotenstein answered Dec 16 '25 23:12