
Should I store attachments in CouchDB or S3, instead?

I'm writing a simple web API that revolves entirely around file uploads. Users upload files through the HTTP API, and the service generates additional files from each upload, which it must store alongside the original for users to access. So there will be a lot of files in play.

Basically, I'm trying to decide between storing these in CouchDB and storing these in something like Amazon's S3.

With CouchDB, I'd probably have a single document for each file the user uploads, with the attachment data inline in the document's _attachments field. Additional files generated by the system would be added to that same document. (The service does document conversion, so a user uploads an Excel XLS and the system generates a PDF, TXT, etc.) I think this would be nice, because one delete on the uploaded document's record would also delete the generated PDFs, TXTs, and any other attachments.
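To make the single-document layout concrete, here is a minimal sketch of what such a document could look like. It only builds the JSON body; the document ID, the `original` field, and the helper names are hypothetical, and the upload itself (a PUT of this body to `/{db}/{doc_id}`) is left out. CouchDB expects inline attachment data to be base64-encoded under `_attachments`.

```python
import base64
import json


def inline_attachment(content_type: str, payload: bytes) -> dict:
    """Encode raw bytes as a CouchDB inline attachment entry."""
    return {
        "content_type": content_type,
        "data": base64.b64encode(payload).decode("ascii"),
    }


def build_upload_doc(doc_id: str, filename: str, content_type: str, payload: bytes) -> dict:
    """Build the document for a user's upload (hypothetical layout)."""
    return {
        "_id": doc_id,
        "original": filename,  # assumed field: remembers which attachment is the upload
        "_attachments": {filename: inline_attachment(content_type, payload)},
    }


def add_generated(doc: dict, filename: str, content_type: str, payload: bytes) -> dict:
    """Return a copy of the document with one more generated file attached."""
    updated = dict(doc)
    updated["_attachments"] = dict(doc.get("_attachments", {}))
    updated["_attachments"][filename] = inline_attachment(content_type, payload)
    return updated


if __name__ == "__main__":
    doc = build_upload_doc("upload-1", "report.xls", "application/vnd.ms-excel", b"xls bytes")
    doc = add_generated(doc, "report.pdf", "application/pdf", b"pdf bytes")
    print(json.dumps(doc, indent=2))
```

Deleting that one document (a DELETE on `/{db}/{doc_id}` with its current `_rev`) removes the original and every generated file together, which is the convenience described above. Updating an already-stored document would also need its `_rev`, which this sketch omits.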

With S3, I feel the security of knowing that I'm using a hosted solution dedicated entirely to file storage. The bandwidth for those files would also come from S3 rather than from my API web server. The downsides are that it adds a lot of logic to my API code, and I'd have to keep a lot of remote files in sync with what my local CouchDB database knows about them. I'd also have to deal with request signing if I wanted end users to access the files directly from S3. And because objects are all stored individually, deleting a user's upload from CouchDB would require several additional delete requests to S3 for the other files.
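A rough sketch of how the S3 side could be kept manageable, assuming all files for one upload share a key prefix. The `uploads/{upload_id}/` layout and the function names are hypothetical; `s3_client` is assumed to be a boto3 S3 client (for example from `boto3.client("s3")`), whose `delete_objects` and `generate_presigned_url` methods are the only calls used.

```python
from typing import List

# S3's multi-object delete accepts at most 1000 keys per request.
MAX_KEYS_PER_DELETE = 1000


def object_key(upload_id: str, filename: str) -> str:
    """Group every file belonging to one upload under a shared prefix (assumed layout)."""
    return f"uploads/{upload_id}/{filename}"


def delete_upload(s3_client, bucket: str, upload_id: str, filenames: List[str]) -> int:
    """Delete the original and all generated files; returns the number of requests sent."""
    keys = [object_key(upload_id, name) for name in filenames]
    requests_sent = 0
    for start in range(0, len(keys), MAX_KEYS_PER_DELETE):
        batch = keys[start:start + MAX_KEYS_PER_DELETE]
        s3_client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": key} for key in batch]},
        )
        requests_sent += 1
    return requests_sent


def download_url(s3_client, bucket: str, upload_id: str, filename: str,
                 expires_seconds: int = 3600) -> str:
    """Return a time-limited signed URL so end users can fetch a file straight from S3."""
    return s3_client.generate_presigned_url(
        "get_object",
        Params={"Bucket": bucket, "Key": object_key(upload_id, filename)},
        ExpiresIn=expires_seconds,
    )
```

With this layout, removing an upload and its conversions is usually one batched request rather than one per file, and the client library handles the signing. The list of filenames still has to come from somewhere (the CouchDB record, in the setup described), so the sync concern remains.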

I'm familiar with S3 and use it in a current project, but CouchDB's attachment support looks really appealing. I'd love to use it, but are there any gotchas or downsides? Do CouchDB attachments make more sense than S3 in the scenario described above, with a lot of uploaded files being stored?

Ryan asked Nov 13 '22 15:11



1 Answer

In my experience, database engines get somewhat flaky when large numbers of binary objects are involved, unless they are specifically built for that.

I had been saving (low-resolution) images inside CouchDB and hit a wall after several gigabytes of attachments. So I moved the attachments to S3 and never looked back.

max answered Dec 21 '22 08:12
