Efficiently move many small files to Amazon S3

I have around 60,000 small image files (total size 200 MB) that I would like to move out of my project repository to Amazon S3.

I have tried s3fs (http://code.google.com/p/s3fs/), mounting S3 via Transmit on Mac OS X as well as the Amazon AWS S3 web uploader. Unfortunately it seems like all of these would take a very long time, more than a day or two, to accomplish the task.

Is there any better way?

Denny asked Dec 28 '11


1 Answer

There are a few things that could be limiting the flow of data and each has a different way to alleviate it:

  1. Your transfer application might be adding overhead. If s3fs is too slow, you might try other options like the S3 tab on the AWS console or a tool like s3cmd.

  2. The network latency between your computer and S3 and the latency in API call responses can be a serious factor in how much you can do in a single thread. The key to solving this is to upload multiple files (dozens) in parallel.

  3. You could just have a slow network connection between you and S3, placing a limit on the total data transfer speed possible. If you can compress the files, you could upload them in compressed form to a temporary EC2 instance and then uncompress and upload from the instance to S3.
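As a sketch of the compress-first idea in point 3, Python's standard `tarfile` module can pack the whole image tree into a single gzipped archive, so a slow uplink moves one ~200 MB stream instead of 60,000 separate requests. The directory and file names below are placeholders:

```python
import os
import tarfile
import tempfile

def pack(src_dir, archive_path):
    # One compressed archive replaces tens of thousands of tiny
    # transfers; unpack it on the EC2 instance and upload to S3
    # from there over AWS's fast internal network.
    with tarfile.open(archive_path, "w:gz") as tar:
        tar.add(src_dir, arcname=os.path.basename(src_dir))

# Minimal demo with a throwaway directory standing in for the repo.
src = tempfile.mkdtemp()
with open(os.path.join(src, "img_00001.png"), "wb") as f:
    f.write(b"\x89PNG demo bytes")

archive = os.path.join(tempfile.gettempdir(), "images.tar.gz")
pack(src, archive)
print(os.path.exists(archive))  # True
```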

My bet is on number 2, which is not always the easiest to solve unless you have upload tools that will parallelize for you.
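The parallel-upload pattern from point 2 can be sketched with Python's standard library. Here `upload_one` is a stand-in for a real S3 call (for example boto's `Key.set_contents_from_filename`); it just returns the path so the sketch runs on its own:

```python
from concurrent.futures import ThreadPoolExecutor

def upload_one(path):
    # Placeholder for the actual S3 PUT; pretend it succeeded.
    return path

def upload_parallel(paths, workers=32):
    # Dozens of threads hide S3's per-request latency: while one
    # thread waits for an API response, the others keep sending.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(upload_one, paths))

files = ["img_%05d.png" % i for i in range(60000)]
uploaded = upload_parallel(files)
print(len(uploaded))  # 60000
```

Since each upload is network-bound rather than CPU-bound, threads (not processes) are enough to saturate the connection.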

Eric Hammond answered Oct 12 '22