I'm working on a website which allows users to upload files (pictures and otherwise). I don't have any prior experience in this area and was hoping to get some input on the right way to store and index these files.
While I would like to have an architecture that scales well to high volume data, I am not currently worrying about extremely high (facebook-, google-scale) volumes.
I was thinking of storing the files on the filesystem at
/files/{username}/
and then having a database, uploads,
where each user has their own table listing the filenames (and thus URLs) of every file they have uploaded, plus any other information I might want to store.
The database end of this (giving each user their own table) seems very inefficient to me, yet keeping records of all files in a single table doesn't seem right either, as it would require searching through the entire table each time a single file is accessed.
My reasoning behind considering giving each user his own table was that it is a neat and distinct way to shard the data across tables and reduce search times when looking for a file given the user.
What Matt H suggested is a good idea if what you are trying to achieve is per-user image access. But given that your database storage space is limited, storing the images as binary data is inefficient, as you stated.
Using a table per user is bad design. The user who uploaded the file should simply be a field/column in the one table that stores all file uploads, along with any file metadata. I suggest generating a GUID for the file name: it is unique for all practical purposes, and a randomly generated one is much harder to guess than an autoincrement field, which matters if you are trying to prevent users from simply walking through all the images.
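To make the single-table idea concrete, here is a minimal sketch in Python using the standard sqlite3 and uuid modules. The table name uploads matches the question; the column names (file_id, username, original_name, uploaded_at) and the function names are my own assumptions, and SQLite merely stands in for whatever database you actually use.

```python
import sqlite3
import uuid


def create_schema(conn):
    # One table for every user's uploads; the uploader is just a column.
    # Column names are illustrative assumptions, not a prescribed schema.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS uploads (
            file_id       TEXT PRIMARY KEY,
            username      TEXT NOT NULL,
            original_name TEXT NOT NULL,
            uploaded_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )


def record_upload(conn, username, original_name):
    # A random (version 4) UUID serves as the stored file name, so the
    # name cannot be derived from a counter the way an autoincrement id can.
    file_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO uploads (file_id, username, original_name) VALUES (?, ?, ?)",
        (file_id, username, original_name),
    )
    return file_id


def files_for_user(conn, username):
    # All files of one user, in insertion order.
    rows = conn.execute(
        "SELECT file_id, original_name FROM uploads "
        "WHERE username = ? ORDER BY rowid",
        (username,),
    )
    return rows.fetchall()
```

The file on disk would then be saved under the returned file_id, for example /files/{username}/{file_id}, while the name the user originally gave it lives only in the table.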
You are concerned about performance, but until you are dealing with millions upon millions of records, queries that select the images belonging to a user, uploaded within a specific time frame (say you are storing a timestamp or similar), are minuscule in cost. If speed does become an issue, you can add a B-tree index on the username column, which would speed up your user-specific image queries significantly.
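If you want to see what the index changes, you can ask the database for its query plan before and after creating it. This is a rough sketch with Python's sqlite3 module (SQLite indexes are B-trees); the table layout, the index name idx_uploads_username and the helper names are assumptions for illustration, and other databases have their own EXPLAIN syntax.

```python
import sqlite3

# Hypothetical query: all uploads belonging to one user.
USER_QUERY = "SELECT file_id, original_name FROM uploads WHERE username = ?"


def build_demo_db():
    # In-memory database with an assumed single uploads table.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE uploads (
            file_id       TEXT PRIMARY KEY,
            username      TEXT NOT NULL,
            original_name TEXT NOT NULL,
            uploaded_at   TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
        )
        """
    )
    return conn


def add_username_index(conn):
    # Index on the column used to filter by user.
    conn.execute(
        "CREATE INDEX IF NOT EXISTS idx_uploads_username ON uploads (username)"
    )


def query_plan(conn, sql, params=()):
    # The last column of each EXPLAIN QUERY PLAN row is the readable detail.
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


if __name__ == "__main__":
    conn = build_demo_db()
    print("before:", query_plan(conn, USER_QUERY, ("alice",)))
    add_username_index(conn)
    print("after: ", query_plan(conn, USER_QUERY, ("alice",)))
```

The exact wording of the plan varies between SQLite versions, but the plan printed after the index is created should mention idx_uploads_username, meaning the lookup no longer has to read the whole table.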
Back on the topic of security, access and organization: store the images with a folder per user (although, depending on the number of users, the number of folders may grow to an unmanageable level). If you don't want the images to be publicly available, store them in a folder outside the web root and have your application read the data and stream it to render the image for the user. This is more complex, but it hides the actual file from the internet. In addition, you would be able to validate every request for an image against the authenticated user.
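The "read it in the application and stream it" part could look roughly like the sketch below. It is framework-agnostic Python; AccessDenied, resolve_private_path and iter_file_chunks are hypothetical names of mine, and the ownership rule (only the owner may read) is a simplifying assumption you would replace with your own permission check.

```python
from pathlib import Path


class AccessDenied(Exception):
    """Raised when a request must not be served."""


def resolve_private_path(storage_root, owner, file_id, requesting_user):
    # Assumed rule: only the owner may read their own files.
    if requesting_user != owner:
        raise AccessDenied("not the owner of this file")

    root = Path(storage_root).resolve()
    user_root = (root / owner).resolve()
    target = (user_root / file_id).resolve()

    # Reject anything that escapes the per-user folder, e.g. "../other/x".
    if root not in user_root.parents or user_root not in target.parents:
        raise AccessDenied("path escapes the storage folder")
    return target


def iter_file_chunks(path, chunk_size=64 * 1024):
    # Yield the file piece by piece so large files are never fully in memory.
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            yield chunk
```

A request handler would call resolve_private_path with the logged-in user's name and then hand the chunks from iter_file_chunks to whatever streaming response your web framework provides. The checks live in a normal function rather than in the generator so that a denied request fails immediately, before any response has started.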
It depends on the nature and structure of your app and database. I've used many techniques, including folder-based, pictures stored in a database blob, off-web file folders accessed through an authentication gateway...
For external images that aren't directly related to the app or database, like temp photos or something, I tend to put those in a folder. Since your structure seems to be pictures from a user, I would expect there to be metadata associated with each image, such as tags. In that case, I would probably store the picture in a database table, assuming I had the capacity for that. If the photos needed to be secured, inaccessible to other users without authentication, then a database would have its own security, whereas file-based storage would need some sort of trick to prevent unauthorized access.
I wouldn't use a table per user, just a single Pictures table with columns for ID, userid and the picture blob.
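As a rough illustration of that table, here is a small Python sketch using sqlite3. The columns id, userid and picture follow the description above; the function names are assumptions of mine, and a real app would likely add metadata columns such as tags.

```python
import sqlite3


def create_pictures_table(conn):
    # One table for all users; the image bytes live in a BLOB column.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS pictures (
            id      INTEGER PRIMARY KEY,
            userid  INTEGER NOT NULL,
            picture BLOB    NOT NULL
        )
        """
    )


def save_picture(conn, userid, data):
    # data is the raw image content as bytes; returns the new row id.
    cur = conn.execute(
        "INSERT INTO pictures (userid, picture) VALUES (?, ?)",
        (userid, data),
    )
    return cur.lastrowid


def load_picture(conn, picture_id, userid):
    # Filtering on userid as well means one user cannot fetch another's row.
    row = conn.execute(
        "SELECT picture FROM pictures WHERE id = ? AND userid = ?",
        (picture_id, userid),
    ).fetchone()
    return None if row is None else bytes(row[0])
```

Because load_picture takes the requesting user's id as part of the query, the access check and the lookup happen in one step.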
Does that help?