 

AWS S3 client for Linux with multipart upload [closed]

What Amazon S3 client do you use on Linux that supports multipart upload? I have 6 GB zip files to upload, and s3curl is not an option because of its 5 GB single-upload limit.

Thanks. James

asked Mar 08 '13 by James Wise

People also ask

Does S3 support multipart upload?

Yes. Multipart upload allows faster, more flexible uploads into Amazon S3: you upload a single object as a set of parts, and after all parts of your object are uploaded, Amazon S3 presents the data as a single object.

Does AWS CLI automatically performs multipart upload?

If you're using the AWS Command Line Interface (AWS CLI), then all high-level aws s3 commands automatically perform a multipart upload when the object is large.

At what size does AWS recommend using multipart upload when uploading objects to S3?

After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation.


5 Answers

I use S3 Tools (s3cmd); it automatically uses the multipart upload feature for files larger than 15 MB on all PUT commands:

Multipart is enabled by default and kicks in for files bigger than 15 MB. You can set this threshold as low as 5 MB (Amazon's limit) with --multipart-chunk-size-mb=5, or to any other value between 5 and 5120 MB.

Once installed and configured, just issue the following command:

~$ s3cmd put largefile.zip s3://bucketname/largefile.zip
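For a 6 GB archive you could also raise the part size explicitly using the option quoted above; for example (the 100 MB value here is just illustrative):

~$ s3cmd put --multipart-chunk-size-mb=100 largefile.zip s3://bucketname/largefile.zip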

Alternatively, you could just use split from the command-line on your zip file:

split -b1024m largefile.zip largefile.zip-

and recombine later on your filesystem using:

cat largefile.zip-* > largefile.zip

If you choose the second option, you may want to store MD5 hashes of the files prior to upload so you can verify the integrity of the archive when it's recombined later.
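A rough sketch of that check, assuming GNU coreutils md5sum (file names are just placeholders):

# before splitting/uploading: record a checksum of the original archive
~$ md5sum largefile.zip > largefile.zip.md5
# after downloading the parts and recombining: verify the result
~$ md5sum -c largefile.zip.md5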

answered Oct 07 '22 by Ryan Weir


The official AWS Command Line Interface supports multipart upload (it uses botocore, the successor to boto, under the hood):

The AWS Command Line Interface (CLI) is a unified tool to manage your AWS services. With just one tool to download and configure, you can control multiple AWS services from the command line and automate them through scripts.

On top of this unified approach to all AWS APIs, it also adds a new set of simple file commands for efficient file transfers to and from Amazon S3, with characteristics similar to the well known Unix commands, e.g.:

  • ls - List S3 objects and common prefixes under a prefix or all S3 buckets.
  • cp - Copies a local file or S3 object to another location locally or in S3.
  • sync - Syncs directories and S3 prefixes.
  • ...

So cp would be sufficient for the use case at hand, but be sure to check out sync as well; it is particularly powerful for many frequently encountered scenarios (and sort of implies cp, depending on the arguments).
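A minimal sketch for this particular 6 GB upload (the bucket name is a placeholder; the chunk-size setting is optional and the 64 MB value is just illustrative):

# high-level command; multipart upload kicks in automatically for large objects
~$ aws s3 cp largefile.zip s3://bucketname/largefile.zip
# optionally raise the part size used for multipart transfers
~$ aws configure set default.s3.multipart_chunksize 64MB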

answered Oct 07 '22 by Steffen Opel


The boto library includes an S3 command-line tool called s3put that can handle multipart upload of large files.
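Usage is roughly as follows (bucket and file names are placeholders; check s3put --help for the multipart-related options available in your boto version):

~$ s3put -b bucketname largefile.zip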

answered Oct 07 '22 by garnaat


You can have a look at the FTP/Amazon S3/Glacier client CrossFTP.

answered Oct 07 '22 by Gatorhall


Personally, I created a Python file, s3upload.py, with a simple function that uploads large files using boto and multipart upload.

Now every time I need to upload a large file, I just run a command like this:

python s3upload.py bucketname extremely_large_file.txt

More details and function code can be found here.
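As a rough sketch (not the author's original code), such a function with boto 2.x might look something like this; the part size, function name, and error handling are illustrative assumptions:

import math
import os
import sys

import boto  # boto 2.x


def upload_large_file(bucket_name, file_path, part_size=50 * 1024 * 1024):
    """Upload file_path to bucket_name via S3 multipart upload (boto 2.x)."""
    conn = boto.connect_s3()  # credentials from ~/.boto or environment variables
    bucket = conn.get_bucket(bucket_name)
    file_size = os.path.getsize(file_path)
    part_count = int(math.ceil(file_size / float(part_size)))

    mp = bucket.initiate_multipart_upload(os.path.basename(file_path))
    try:
        with open(file_path, 'rb') as fp:
            for part_num in range(1, part_count + 1):
                # read exactly the number of bytes belonging to this part
                bytes_this_part = min(part_size, file_size - (part_num - 1) * part_size)
                mp.upload_part_from_file(fp, part_num=part_num, size=bytes_this_part)
        mp.complete_upload()
    except Exception:
        mp.cancel_upload()
        raise


if __name__ == '__main__':
    upload_large_file(sys.argv[1], sys.argv[2])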

answered Oct 07 '22 by Sergey