You could implement the front-end in pretty much anything that you can code to speak the native S3 multipart upload API... which is the approach I'd recommend for this, because of its stability.
With a multipart upload, "you" (meaning the developer, not the end user, I would suggest) choose a part size, minimum 5MB per part, and the file can be no larger than 10,000 "parts", each exactly the same size (the one "you" selected at the beginning of the upload), except for the last part, which is however many bytes are left over at the end... so the ultimate maximum size of the uploaded file depends on the part size you choose. For example, at the 5MB minimum you top out at roughly 48 GB, while 64MB parts allow files up to roughly 625 GB.
The size of a "part" essentially becomes your restartable/retryable block size (win!)... so your front-end implementation can resend a failed part as many times as needed until it goes through correctly. Parts don't even have to be uploaded in order; they can be uploaded in parallel, and if you upload the same part more than once, the newer one replaces the older one. With each part, S3 returns a checksum that you compare to your locally calculated one. The object doesn't become visible in S3 until you finalize the upload, and if S3 hasn't got all the parts when you finalize (which it should, because they were all acknowledged when they uploaded), the finalize call will fail.
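To make that concrete, here's a minimal sketch of that flow in Python with boto3 (the bucket name, key, and part size are placeholders, and comparing the part ETag to a local MD5 assumes the bucket isn't using SSE-KMS or SSE-C, where ETags are not plain MD5 digests):

```python
import hashlib

import boto3
from botocore.exceptions import ClientError

# Placeholder values -- substitute your own bucket, key, and part size.
BUCKET = "example-bucket"
KEY = "uploads/big-file.bin"
PART_SIZE = 64 * 1024 * 1024   # 64 MB parts
MAX_RETRIES = 5

s3 = boto3.client("s3")

def upload_file_multipart(path):
    upload_id = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)["UploadId"]
    parts = []
    try:
        with open(path, "rb") as f:
            part_number = 1
            while True:
                chunk = f.read(PART_SIZE)
                if not chunk:
                    break
                local_md5 = hashlib.md5(chunk).hexdigest()
                for _ in range(MAX_RETRIES):
                    try:
                        resp = s3.upload_part(
                            Bucket=BUCKET, Key=KEY, UploadId=upload_id,
                            PartNumber=part_number, Body=chunk)
                    except ClientError:
                        continue   # network/API hiccup: resend this part
                    # For unencrypted or SSE-S3 uploads the part ETag is the
                    # part's MD5, so it can double as an integrity check.
                    if resp["ETag"].strip('"') == local_md5:
                        parts.append({"PartNumber": part_number,
                                      "ETag": resp["ETag"]})
                        break
                else:
                    raise RuntimeError(f"part {part_number} failed after retries")
                part_number += 1
        # Nothing becomes visible in the bucket until this call succeeds.
        s3.complete_multipart_upload(
            Bucket=BUCKET, Key=KEY, UploadId=upload_id,
            MultipartUpload={"Parts": parts})
    except Exception:
        # Abort so orphaned parts don't keep accruing storage charges
        # (see the note about incomplete uploads below).
        s3.abort_multipart_upload(Bucket=BUCKET, Key=KEY, UploadId=upload_id)
        raise
```

The per-part retry loop is the whole point: a dropped connection only costs you the part that was in flight, not the whole file.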
The one thing you do have to keep in mind, though, is that multipart uploads apparently never time out: if an upload is never finalized/completed and never actively aborted by the client utility, you keep paying for the storage of the parts that were uploaded. So you want to implement an automated back-end process that periodically calls ListMultipartUploads to identify uploads that for whatever reason were never finished or canceled, and aborts them.
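A scheduled cleanup job along these lines (again Python/boto3; the bucket name is a placeholder and the seven-day cutoff is an arbitrary choice) is enough to keep orphaned parts from quietly accumulating charges:

```python
from datetime import datetime, timedelta, timezone

import boto3

BUCKET = "example-bucket"       # placeholder bucket name
MAX_AGE = timedelta(days=7)     # arbitrary "this upload is abandoned" cutoff

s3 = boto3.client("s3")

def abort_stale_multipart_uploads():
    cutoff = datetime.now(timezone.utc) - MAX_AGE
    paginator = s3.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=BUCKET):
        for upload in page.get("Uploads", []):
            # Anything initiated long ago and still unfinished gets aborted,
            # which deletes its stored parts.
            if upload["Initiated"] < cutoff:
                s3.abort_multipart_upload(Bucket=BUCKET,
                                          Key=upload["Key"],
                                          UploadId=upload["UploadId"])
```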
I don't know how helpful this is as an answer to your overall question, but developing a custom front-end tool should not be a complicated matter -- the S3 API is very straightforward. I can say this because I developed a utility to do exactly that (for my internal use -- this isn't a product plug). I may one day release it as open source, but it likely wouldn't suit your needs anyway -- it's essentially a command-line utility that automated/scheduled processes can use to stream ("pipe") the output of a program directly into S3 as a series of multipart parts (the files are large, so my default part size is 64MB), and when the input stream is closed by the program generating the output, it detects this and finalizes the upload. :) I use it to stream live database backups, passed through a compression program, directly into S3 as they are generated, without those massive files ever needing to exist on any hard drive.
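My own tool isn't public, but the core of that "pipe into S3" loop is small enough to sketch here (hypothetical bucket and key; retries and error handling omitted for brevity):

```python
import sys

import boto3

BUCKET, KEY = "example-bucket", "backups/db-dump.gz"   # placeholder names
PART_SIZE = 64 * 1024 * 1024                           # 64 MB parts

s3 = boto3.client("s3")
upload_id = s3.create_multipart_upload(Bucket=BUCKET, Key=KEY)["UploadId"]
parts, part_number = [], 1

# Read stdin until the program feeding the pipe closes its end.
while True:
    chunk = sys.stdin.buffer.read(PART_SIZE)
    if not chunk:
        break   # EOF: the upstream program has finished writing
    resp = s3.upload_part(Bucket=BUCKET, Key=KEY, UploadId=upload_id,
                          PartNumber=part_number, Body=chunk)
    parts.append({"PartNumber": part_number, "ETag": resp["ETag"]})
    part_number += 1

# Finalizing is what makes the object appear in the bucket.
s3.complete_multipart_upload(Bucket=BUCKET, Key=KEY, UploadId=upload_id,
                             MultipartUpload={"Parts": parts})
```

You'd invoke something like this as the last stage of a pipeline, e.g. `mysqldump mydb | gzip | python stream_to_s3.py` (commands shown only as an example).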
Your desire to have a smooth experience for your clients, in my opinion, highly commends S3 multipart for the role, and if you know how to code in anything that can generate a desktop or browser-based UI, can read local desktop filesystems, and has libraries for HTTP and SHA/HMAC, then you can write a client to do this that looks and feels exactly the way you need it to.
You wouldn't need to set up anything manually in AWS for each client, so long as you have a back-end system that authenticates the client utility to you -- perhaps with a username and password sent over an SSL connection to an application on a web server -- and then provides the client utility with automatically-generated temporary AWS credentials that it can use to do the uploading.
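One common way to vend those temporary credentials (sketched below with boto3's STS client, a hypothetical bucket, and a per-client key prefix; your authentication layer and naming will differ) is to have the web application call GetFederationToken with an inline policy scoped to that client, and hand the resulting credentials back over the SSL connection:

```python
import json

import boto3

BUCKET = "example-bucket"   # placeholder bucket name
sts = boto3.client("sts")   # runs server-side, under your own IAM credentials

def issue_upload_credentials(client_id: str) -> dict:
    """Return temporary credentials that can only touch this client's prefix."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["s3:PutObject",
                       "s3:AbortMultipartUpload",
                       "s3:ListMultipartUploadParts"],
            "Resource": f"arn:aws:s3:::{BUCKET}/{client_id}/*",
        }],
    }
    resp = sts.get_federation_token(
        Name=client_id[:32],        # federated user name (2-32 characters)
        Policy=json.dumps(policy),
        DurationSeconds=3600)       # credentials valid for one hour
    # Contains AccessKeyId, SecretAccessKey, SessionToken, Expiration.
    return resp["Credentials"]
```

The client utility then plugs the returned AccessKeyId, SecretAccessKey, and SessionToken into its own S3 client and can upload only under its own prefix until the credentials expire.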