Do we need directory structure logic for storing millions of images on Amazon S3/Cloudfront?

In order to support millions of potential images we have previously followed this sort of directory structure:

/profile/avatars/44/f2/47/48px/44f247d4e3f646c66d4d0337c6d415eb.jpg

The filename is an MD5 hash. We take the first six characters of that string and build the folder structure from them.

So in the above example the filename:

44f247d4e3f646c66d4d0337c6d415eb.jpg

produces a directory structure of:

/44/f2/47/

We always did this in order to minimize the number of photos in any single directory, ultimately to aid filesystem performance.
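As a rough sketch of what that logic looks like (the function names, the `size` and `base` parameters, and their defaults are illustrative, not our actual code), compared with the flat layout we are considering:

```python
import posixpath


def sharded_path(filename, size="48px", base="/profile/avatars", levels=3):
    """Build the nested path from the first `levels` pairs of hex characters.

    Assumes `filename` is already "<md5 hex digest>.<extension>".
    """
    shards = [filename[i * 2:i * 2 + 2] for i in range(levels)]
    return posixpath.join(base, *shards, size, filename)


def flat_path(filename, size="48px", base="/profile/avatars"):
    """Build the simpler path with no hash-derived directories."""
    return posixpath.join(base, size, filename)


name = "44f247d4e3f646c66d4d0337c6d415eb.jpg"
print(sharded_path(name))
print(flat_path(name))
```

The first call reproduces the example path above; the second is the simplified form.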

However, our new app uses Amazon S3 with CloudFront.

My understanding is that any folders you create on Amazon S3 are actually just references and are not directories on the filesystem.

If that is correct, is it still recommended to split files into folders/directories using the above or a similar method? Or can we simply remove this complexity from our application code and provide image links like so:

/profile/avatars/48px/filename.jpg

Bear in mind that this app is intended to serve tens of millions of photos.

Any guidance would be greatly appreciated.

757

asked Oct 23 '13 12:10

gordyr

2 Answers

Although S3 folders are basically only another way of writing the key name (as @E.J.Brennan already said in his answer), there are reasons to think about the naming structure of your "folders".

With your number of photos, and depending on your access patterns, it might make sense to choose key names that speed up S3 key lookups by making sure operations on photos get spread out over multiple partitions. There is a great article on the AWS blog explaining all the details.
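To illustrate the general idea of spreading keys out, here is a minimal sketch. The `hashed_key` helper, the user-ID input and the two-character shard length are assumptions made for this example, not something S3 prescribes, and whether you need this at all depends on your request rates. Check the current S3 performance guidance before adopting it, since the recommendations have changed over time.

```python
import hashlib
from collections import Counter


def hashed_key(user_id, size="48px", shard_chars=2):
    """Return an object key that starts with a few characters of an MD5 digest.

    Keys that begin with hash characters are spread evenly across the
    key space instead of clustering under one common leading prefix.
    """
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return f"{digest[:shard_chars]}/profile/avatars/{size}/{digest}.jpg"


# Generate keys for a batch of hypothetical user IDs and count how many
# land under each leading shard.
keys = [hashed_key(user_id) for user_id in range(10_000)]
spread = Counter(key.split("/", 1)[0] for key in keys)

print(len(spread), "distinct leading prefixes")
print("largest shard holds", max(spread.values()), "keys")
```

Since the filenames in the question are already MD5 hashes, the same effect could be had by taking the leading characters from the filename itself.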

102

answered Oct 12 '22 01:10

j0nes

You don't need to set up that structure on S3 unless you are doing it for your own convenience. All of the folders you create on S3 are really just an illusion for you; the files are stored in one big flat container. So if you don't have a reason to organize the files in a pseudo-folder hierarchy, don't bother.

If you needed to control access for different groups of people based on your folder structure, that might be a reason to keep the structure, but besides that there probably isn't a benefit.
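A small in-memory model may make the "illusion" concrete. This is not the S3 API, just a sketch of how a flat set of keys can be presented as folders by grouping on a prefix and a delimiter, which is roughly what a delimiter-based listing does. The `list_prefix` function and the sample keys are made up for illustration.

```python
def list_prefix(keys, prefix="", delimiter="/"):
    """Group a flat collection of keys the way a folder view would.

    Returns (contents, common_prefixes): keys directly "inside" the
    prefix, and the pseudo-folders found one level below it.
    """
    contents = []
    common_prefixes = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything up to the next delimiter looks like a subfolder.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common_prefixes)


# The "bucket" is just a flat mapping of full key names to data.
bucket = {
    "profile/avatars/48px/a.jpg": b"...",
    "profile/avatars/48px/b.jpg": b"...",
    "profile/avatars/96px/a.jpg": b"...",
}

print(list_prefix(bucket, "profile/avatars/"))
print(list_prefix(bucket, "profile/avatars/48px/"))
```

No folder objects exist anywhere in `bucket`; the hierarchy appears only when the keys are listed with a delimiter.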

answered Oct 12 '22 00:10

E.J. Brennan

Tags: amazon-web-services, amazon-s3, amazon-cloudfront