Best practices for storing references to AWS S3 objects in a database?

We store files in Amazon S3 and want to keep references to those files in a Document table in Postgres. I am looking for best practices. We use Python/Django, and currently store the URL that comes back from boto3.s3.key.Key().generate_url(...). That approach has several problems:

  • Must parse the bucket and key out of the URL.
  • Need to urldecode the key name.
  • Doesn't support object versioning.
  • Unicode support is easy to mess up, especially because of the urlencode/decode steps.
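
To make the first two points concrete, here is a minimal sketch of what recovering the bucket and key from a stored URL involves. It assumes a virtual-hosted-style URL (`<bucket>.s3.amazonaws.com` or `<bucket>.s3.<region>.amazonaws.com`); the function name `parse_s3_url` and the example URL are hypothetical, and path-style URLs are not handled.

```python
from urllib.parse import urlsplit, unquote


def parse_s3_url(url):
    """Recover (bucket, key) from a virtual-hosted-style S3 URL.

    Assumption: the host looks like '<bucket>.s3.amazonaws.com' or
    '<bucket>.s3.<region>.amazonaws.com'. Anything else is rejected.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    bucket, sep, rest = host.partition(".s3.")
    if not sep or not rest.endswith("amazonaws.com"):
        raise ValueError(f"not a recognised S3 URL: {url!r}")
    # Drop the single leading '/', then undo the percent-encoding (UTF-8).
    key = unquote(parts.path[1:])
    return bucket, key


# Hypothetical stored URL; the query string (signature, expiry) is ignored.
bucket, key = parse_s3_url(
    "https://my-bucket.s3.amazonaws.com/reports/r%C3%A9sum%C3%A9%20v2.pdf?Expires=1510590000"
)
```

Every read of the reference has to repeat this parsing and decoding, and the result depends on which URL style was in use when the row was written.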

So, I'm considering storing the Bucket, Key, and Version in three separate fields, and creating the Key as a combination of the DB primary key plus a safely-encoded filename. Are there better approaches?
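
For illustration, the three-field idea might look roughly like this in plain Python. The names `S3Ref`, `build_key`, and `original_filename` are hypothetical, and percent-encoding the filename is only one possible reading of "safely-encoded"; in Django the three attributes would map to ordinary model fields.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import quote, unquote


@dataclass(frozen=True)
class S3Ref:
    """The three values stored per document instead of a URL."""
    bucket: str
    key: str
    version_id: Optional[str] = None  # None when the bucket is unversioned


def build_key(pk: int, filename: str) -> str:
    # Percent-encode everything except unreserved ASCII, so the key
    # contains no '/' or non-ASCII characters from the filename.
    return f"{pk}/{quote(filename, safe='')}"


def original_filename(key: str) -> str:
    # Reverse of build_key: drop the primary-key prefix, then decode.
    _, _, encoded = key.partition("/")
    return unquote(encoded)


ref = S3Ref(bucket="my-bucket", key=build_key(42, "résumé v2.pdf"))
```

Because the key is stored as-is, no URL parsing is needed on read, and a URL can be generated on demand from the three fields.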

Scott Stafford asked Nov 13 '17 16:11



1 Answer

I'm not sure this is the best approach, but we store:

  • a unique object ID (for example a UUID, for which Postgres has a native type) in the database
  • the bucket name and path in configuration (we store all objects of the same type under the same bucket and path)
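
A minimal sketch of that split, assuming a simple dictionary stands in for the configuration. `STORAGE_CONFIG`, `object_location`, and the bucket and prefix values are hypothetical names chosen for the example.

```python
import uuid

# Hypothetical configuration: one bucket + path prefix per object type.
STORAGE_CONFIG = {
    "document": {"bucket": "example-docs", "prefix": "documents/"},
}


def object_location(object_type, object_id, config=STORAGE_CONFIG):
    """Combine the configured bucket/prefix with the ID stored in the database."""
    entry = config[object_type]
    return entry["bucket"], f"{entry['prefix']}{object_id}"


# The database row only holds this ID.
doc_id = uuid.UUID("12345678-1234-5678-1234-567812345678")
location = object_location("document", doc_id)

# Moving the objects means changing the configuration, not the table.
moved_config = {"document": {"bucket": "example-archive", "prefix": "docs/2017/"}}
moved_location = object_location("document", doc_id, moved_config)
```

The bucket and key are derived at read time, so the stored ID stays valid wherever the objects end up.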

That way you can at least:

  • Move objects to a different bucket / path without having to rewrite your whole database table
  • Switch from S3 to local storage if you choose to
  • Throw away your primary key (e.g. while partitioning tables) without losing track of your objects

Linas Valiukas answered Sep 20 '22 13:09