 

Tracking Upload Progress of File to S3 Using Ruby aws-sdk

Firstly, I am aware that there are quite a few questions similar to this one on SO. I have read most, if not all, of them over the past week, but I still can't make this work.

I am developing a Ruby on Rails app that allows users to upload mp3 files to Amazon S3. The upload itself works perfectly, but a progress bar would greatly improve the user experience on the website.

I am using the aws-sdk gem, the official one from Amazon. I have looked everywhere in its documentation for callbacks during the upload process, but I couldn't find anything.

The files are uploaded one at a time directly to S3, so nothing needs to be loaded into memory. No multiple-file upload is necessary either.

I figured that I may need to use jQuery to make this work, and I am fine with that. I found this library, which looked very promising: https://github.com/blueimp/jQuery-File-Upload. I even tried following the example here: https://github.com/ncri/s3_uploader_example

But I just could not make it work for me.

The documentation for aws-sdk also briefly describes streaming uploads with a block:

  obj.write do |buffer, bytes|
     # writing fewer than the requested number of bytes to the buffer
     # will cause write to stop yielding to the block
  end

But this is barely helpful. How does one "write to the buffer"? I tried a few intuitive options, but they always resulted in timeouts. And how would I even update the browser based on the buffering?

Is there a better or simpler solution to this?

Thank you in advance. I would appreciate any help on this subject.

asked Aug 23 '12 by DaedalusCoder



2 Answers

The "buffer" object yielded when passing a block to #write is an instance of StringIO. You can write to the buffer using #write or #<<. Here is an example that uses the block form to upload a file.

file = File.open('/path/to/file', 'rb') # binary mode, since mp3s are binary data

obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => file.size) do |buffer, bytes|
  buffer.write(file.read(bytes))
  # you could do some interesting things here to track progress
end

file.close
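
Building on that, one way to track progress is to count the bytes fed into the buffer and compare them against the file size. Here is a minimal sketch, assuming the same v1 aws-sdk block form as above; the puts is purely illustrative, and in a Rails app you would store or push this value so the browser can read it (for example via a polled progress endpoint):

file = File.open('/path/to/file', 'rb')
total_bytes   = file.size
written_bytes = 0

obj = s3.buckets['my-bucket'].objects['object-key']
obj.write(:content_length => total_bytes) do |buffer, bytes|
  chunk = file.read(bytes).to_s # nil at EOF becomes '', which stops the yielding
  buffer.write(chunk)
  written_bytes += chunk.bytesize

  # replace this with however you surface progress to the client
  puts format('%.1f%% uploaded', 100.0 * written_bytes / total_bytes)
end

file.close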
answered Oct 13 '22 by Trevor Rowe


After reading the source code of the AWS gem, I adapted (mostly copied) the multipart upload method to yield the current progress based on how many chunks have been uploaded:

s3 = AWS::S3.new.buckets['your_bucket']

file = File.open(filepath, 'r', encoding: 'BINARY')
file_to_upload = "#{s3_dir}/#{filename}"
upload_progress = 0

opts = {
  content_type: mime_type,
  cache_control: 'max-age=31536000',
  estimated_content_length: file.size,
}

part_size = self.compute_part_size(opts)

parts_number = (file.size.to_f / part_size).ceil.to_i
obj          = s3.objects[file_to_upload]

begin
  obj.multipart_upload(opts) do |upload|
    until file.eof?
      break if upload.aborted?

      upload.add_part(file.read(part_size))
      upload_progress += 1.0 / parts_number

      # Yields the Float progress and the multipart upload object
      # for the current file that's being uploaded
      yield(upload_progress, upload) if block_given?
    end
  end
ensure
  file.close
end
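
As a usage sketch, assuming the loop above is wrapped in a method (here called upload_with_progress, a made-up name, not part of the gem), the yielded values could drive a progress bar like this:

upload_with_progress(filepath, s3_dir, filename, mime_type) do |progress, upload|
  percent = (progress * 100).round
  # persist this somewhere the browser can poll, e.g. Redis or the database
  Rails.logger.info "#{filename}: #{percent}% uploaded"
end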

The compute_part_size method is adapted from the gem's source, and I've modified it to this:

def compute_part_size(options)
  max_parts      = 10_000
  min_size       = 5_242_880 # 5 MB, the S3 minimum part size
  estimated_size = options[:estimated_content_length]

  [(estimated_size.to_f / max_parts).ceil, min_size].max.to_i
end
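
As a worked example: for a 100 MB file, estimated_size / max_parts comes to roughly 10 KB, so the 5 MB minimum wins. part_size is 5242880 bytes, parts_number is 20, and the loop above yields progress in increments of 0.05 (5%).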

This code was tested on Ruby 2.0.0-p0.

answered Oct 13 '22 by emartini