 

Faster s3 bucket duplication

I have been trying to find a better command-line tool for duplicating buckets than s3cmd. s3cmd can duplicate buckets without having to download and upload each file. The command I normally run to duplicate buckets using s3cmd is:

s3cmd cp -r --acl-public s3://bucket1 s3://bucket2 

This works, but it is very slow as it copies each file via the API one at a time. If s3cmd could run in parallel mode, I'd be very happy.
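In the meantime, one workaround is to drive s3cmd with GNU parallel so several copies run at once. A rough, untested sketch (it assumes GNU parallel is installed, that no key contains whitespace, and uses the bucket1/bucket2 placeholders from above):

# List every key in the source bucket, strip each line down to the key name,
# then run up to 16 s3cmd copies at a time.
s3cmd ls --recursive s3://bucket1 \
  | awk '{print $4}' \
  | sed 's|^s3://bucket1/||' \
  | parallel -j 16 s3cmd cp --acl-public s3://bucket1/{} s3://bucket2/{}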

Are there any other command-line tools or code that people use to duplicate buckets faster than s3cmd?

Edit: Looks like s3cmd-modification is exactly what I'm looking for. Too bad it does not work. Are there any other options?

asked Jan 11 '11 by Sean McCleary




1 Answer

The AWS CLI seems to do the job perfectly, and has the bonus of being an officially supported tool:

aws s3 sync s3://mybucket s3://backup-mybucket 

http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
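Since the original s3cmd command copied with a public ACL, note that sync accepts a canned ACL as well, e.g.:

aws s3 sync s3://mybucket s3://backup-mybucket --acl public-read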

It supports concurrent transfers by default; see http://docs.aws.amazon.com/cli/latest/topic/s3-config.html#max-concurrent-requests

To transfer a huge number of small files quickly, run the command from an EC2 instance to decrease latency, and increase max_concurrent_requests to reduce the impact of that latency, e.g.:

aws configure set default.s3.max_concurrent_requests 200 
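The same setting can also be made persistent in ~/.aws/config instead of via aws configure set (200 is just the example value from above):

[default]
s3 =
  max_concurrent_requests = 200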
answered Sep 30 '22 by python1981