
aws s3 putObject vs sync

I need to upload a large file to an AWS S3 bucket. Every 10 minutes my code deletes the old file from the source directory and generates a new one, around 500 MB in size. Currently I use the s3.putObject() method to upload each file after it is created. I have also heard about aws s3 sync, which comes with the AWS CLI and is used for uploading files to an S3 bucket.

I use the aws-sdk for Node.js for the S3 upload, and it does not contain an s3 sync method. Is s3 sync better than the s3.putObject() method? I need a faster upload.

Abdul Manaf asked Aug 23 '16 05:08


People also ask

What is the difference between aws s3 cp and sync?

aws s3 cp will copy all files, even if they already exist in the destination area. It also will not delete files from your destination if they are deleted from the source. aws s3 sync looks at the destination before copying files over and only copies over files that are new and updated.

What is S3 PutObject?

Adds an object to a bucket. You must have WRITE permissions on a bucket to add an object to it. Amazon S3 never adds partial objects; if you receive a success response, Amazon S3 added the entire object to the bucket.

What is aws s3 sync?

The s3 sync command synchronizes the contents of a bucket and a directory, or the contents of two buckets. Typically, s3 sync copies missing or outdated files or objects between the source and target.

What is the best way for the application to upload the large files in s3?

When you upload large files to Amazon S3, it's a best practice to leverage multipart uploads. If you're using the AWS Command Line Interface (AWS CLI), then all high-level aws s3 commands automatically perform a multipart upload when the object is large. These high-level commands include aws s3 cp and aws s3 sync.


1 Answer

There's always more than one way to do one thing, so to upload a file to an S3 bucket you can:

  • use the AWS CLI and run aws s3 cp ...
  • use the AWS CLI and run aws s3api put-object ...
  • use the AWS SDK (in your language of choice)

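You can also use sync, but for a single file there is no need to sync a whole directory; cp is enough. Note that sync is not single-threaded: both cp and sync are high-level commands that automatically use multipart uploads for large files and send several requests concurrently (the CLI's max_concurrent_requests setting, 10 by default), so you do not need to start multiple cp instances to get parallelism.

Basically, all these methods are wrappers around the AWS S3 API calls. From the Amazon documentation: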
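As a rough illustration of what a multipart upload of the 500 MB file from the question involves, here is a small sketch that counts the parts for a chosen part size. The 5 MiB minimum part size and the 10,000-part limit are S3's documented multipart limits; planParts is a hypothetical helper written for this example, not part of the AWS SDK or CLI, and the 8 MiB part size is just an example value.

```javascript
// S3 multipart limits: every part except the last must be at least 5 MiB,
// and one upload may have at most 10,000 parts.
const MiB = 1024 * 1024;
const MIN_PART_SIZE = 5 * MiB;
const MAX_PARTS = 10000;

// planParts is a hypothetical helper (not an AWS API). It returns how many
// parts a file of `fileSize` bytes needs and how large the last part is.
function planParts(fileSize, partSize) {
  if (fileSize <= 0) {
    throw new RangeError('file size must be positive');
  }
  if (partSize < MIN_PART_SIZE) {
    throw new RangeError('part size must be at least 5 MiB');
  }
  const partCount = Math.ceil(fileSize / partSize);
  if (partCount > MAX_PARTS) {
    throw new RangeError('too many parts; increase the part size');
  }
  // All parts are full-size except possibly the last one.
  const lastPartSize = fileSize - (partCount - 1) * partSize;
  return { partCount, lastPartSize };
}

// A 500 MiB file split into 8 MiB parts (example values).
console.log(planParts(500 * MiB, 8 * MiB));
```

Because the parts are independent requests, a client can send several of them at once, which is where most of the speed-up over a single PUT comes from.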

Making REST API calls directly from your code can be cumbersome. It requires you to write the necessary code to calculate a valid signature to authenticate your requests. We recommend the following alternatives instead:

  • Use the AWS SDKs to send your requests (see Sample Code and Libraries). With this option, you don't need to write code to calculate a signature for request authentication because the SDK clients authenticate your requests by using access keys that you provide. Unless you have a good reason not to, you should always use the AWS SDKs.
  • Use the AWS CLI to make Amazon S3 API calls. For information about setting up the AWS CLI and example Amazon S3 commands see the following topics: Set Up the AWS CLI in the Amazon Simple Storage Service Developer Guide. Using Amazon S3 with the AWS Command Line Interface in the AWS Command Line Interface User Guide.

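So Amazon recommends using the SDK. At the end of the day, I think it's really a matter of what you're most comfortable with and how you will integrate this piece of code into the rest of your program. For a one-time action, I always go with the CLI.

In terms of performance, though, choosing the CLI or the SDK will not by itself make a difference, since again they are just wrappers around the AWS API calls. What can matter for a 500 MB file is whether it goes up as a single PUT request (s3.putObject()) or as a multipart upload, which the high-level aws s3 commands use automatically for large files. For transfer optimization, you should also look at S3 Transfer Acceleration and see if you can enable it on your bucket.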
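Since the question uses the AWS SDK for Node.js, here is a minimal sketch of a multipart upload from the SDK. It assumes v2 of the aws-sdk package, whose s3.upload() method splits a large body into parts and sends several of them in parallel, unlike a single s3.putObject() request. The helper name uploadLargeFile, the bucket, key and path in the commented wiring, and the partSize/queueSize values are all placeholders chosen for illustration.

```javascript
// uploadLargeFile is a hypothetical helper. `s3` is expected to be an AWS.S3
// client from the v2 `aws-sdk` package; it is passed in so the sketch does
// not depend on how the client is configured. `body` can be a Buffer or a
// readable stream.
function uploadLargeFile(s3, bucket, key, body) {
  const params = { Bucket: bucket, Key: key, Body: body };
  // partSize: bytes per part (at least 5 MiB); queueSize: parts in flight.
  // These are example values, not tuned recommendations.
  const options = { partSize: 10 * 1024 * 1024, queueSize: 4 };
  return s3.upload(params, options).promise();
}

// Example wiring (requires `npm install aws-sdk`; all names are placeholders):
//
// const fs = require('fs');
// const AWS = require('aws-sdk');
// // useAccelerateEndpoint only works if Transfer Acceleration is enabled
// // on the bucket; drop the option otherwise.
// const s3 = new AWS.S3({ useAccelerateEndpoint: true });
// uploadLargeFile(s3, 'my-bucket', 'exports/latest.dat',
//                 fs.createReadStream('/data/latest.dat'))
//   .then((data) => console.log('uploaded to', data.Location))
//   .catch((err) => console.error('upload failed', err));
```

Whether a larger partSize or queueSize actually helps depends on the available bandwidth and memory, so it is worth measuring with the real file before settling on values.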

Frederic Henri answered Oct 25 '22 06:10