
AWS CLI S3 sync only over selected files?

I need to synchronize two AWS S3 buckets, but only for the files in a given list. This is the scenario:

BucketA:

File1.jpg Deleted  
File2.jpg Modified
File3.jpg Deleted
File4.jpg Modified
File5.jpg Modified
File6.jpg New

BucketB:

File1.jpg 
File2.jpg 
File3.jpg 
File4.jpg 
File5.jpg 

I'm looking for a command like this:

aws s3 sync s3://BucketA s3://BucketB --delete --exclude "*" --include "File1.jpg;File2.jpg;File4.jpg"

The resulting BucketB must look like this:

File1.jpg Deleted
File2.jpg Modified
File3.jpg Unchanged
File4.jpg Modified
File5.jpg Unchanged

Any idea?

Pau Dominguez asked Jul 21 '15 10:07


1 Answer

It looks like this is achievable, except for the deletion part.

This command will sync only the specified files:

aws s3 sync s3://bucketA s3://bucketB --exclude "*" --include "File1.jpg" --include "File2.jpg" --include "File4.jpg"
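
If the keys live in a text file rather than on the command line, the repeated --include options can be generated from it. The following is only a sketch: sync_listed is a hypothetical helper name (not part of the AWS CLI), and it assumes a list file with one key per line.

```shell
# Hypothetical helper: run "aws s3 sync" restricted to the keys listed in a file.
# Usage: sync_listed <list-file> <source-uri> <destination-uri>
sync_listed() {
  list_file="$1"
  src="$2"
  dst="$3"

  # Start with the catch-all exclude, then append one --include per listed key.
  set -- --exclude "*"
  while IFS= read -r key || [ -n "$key" ]; do
    # Skip blank lines in the list file.
    [ -n "$key" ] && set -- "$@" --include "$key"
  done < "$list_file"

  aws s3 sync "$src" "$dst" "$@"
}
```

For example, with a file keys.txt containing File1.jpg, File2.jpg and File4.jpg on separate lines, sync_listed keys.txt s3://bucketA s3://bucketB runs the same command as above. Without --delete, this only adds or updates the listed keys in the destination.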

However, the --delete parameter seems to consider only the files in BucketA that match the --include parameters. All other files become 'invisible' to the sync and are therefore deleted from BucketB.

This command:

aws s3 sync s3://bucketA s3://bucketB --delete --exclude "*" --include "File1.jpg" --include "File2.jpg" --include "File4.jpg"

actually deletes all files except File2.jpg and File4.jpg. So, it doesn't look like you can do a selective delete in the expected manner.

Here's a script to test all of the above:

aws s3 cp foo s3://bucketa/File1.jpg
aws s3 cp foo s3://bucketa/File2.jpg
aws s3 cp foo s3://bucketa/File3.jpg
aws s3 cp foo s3://bucketa/File4.jpg
aws s3 cp foo s3://bucketa/File5.jpg
aws s3 sync s3://bucketa s3://bucketb
aws s3 rm s3://bucketa/File1.jpg
aws s3 rm s3://bucketa/File3.jpg
aws s3 cp foo s3://bucketa/File6.jpg
aws s3 cp bar s3://bucketa/File2.jpg
aws s3 cp bar s3://bucketa/File4.jpg
aws s3 cp bar s3://bucketa/File5.jpg

aws s3 ls s3://bucketa
2015-07-23 08:50:44         49 File2.jpg
2015-07-23 08:50:49         49 File4.jpg
2015-07-23 08:50:53         49 File5.jpg
2015-07-23 08:50:20         24 File6.jpg

aws s3 ls s3://bucketb
2015-07-23 08:49:35         24 File1.jpg
2015-07-23 08:49:35         24 File2.jpg
2015-07-23 08:49:36         24 File3.jpg
2015-07-23 08:49:36         24 File4.jpg
2015-07-23 08:49:36         24 File5.jpg 

aws s3 sync s3://bucketa s3://bucketb --exclude "*" --include "File1.jpg" --include "File2.jpg" --include "File4.jpg"
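
Since sync with --delete does not appear to delete selectively, one possible workaround is to handle each listed key individually: copy it if it still exists in the source bucket, otherwise remove it from the destination. This is a rough sketch, not a tested production script. sync_selected is a hypothetical helper name, and it takes bare bucket names (no s3:// prefix) because aws s3api head-object expects them that way.

```shell
# Hypothetical helper: mirror only the given keys from one bucket to another,
# including deletions.
# Usage: sync_selected <source-bucket> <destination-bucket> <key>...
sync_selected() {
  src="$1"
  dst="$2"
  shift 2

  for key in "$@"; do
    # head-object exits non-zero when the key is missing. It also fails on
    # other errors (permissions, network), so check credentials first.
    if aws s3api head-object --bucket "$src" --key "$key" >/dev/null 2>&1; then
      # Key still exists in the source: copy it, replacing any older version.
      aws s3 cp "s3://$src/$key" "s3://$dst/$key"
    else
      # Key is gone from the source: remove it from the destination.
      aws s3 rm "s3://$dst/$key"
    fi
  done
}
```

With the buckets from the test script, sync_selected bucketa bucketb File1.jpg File2.jpg File4.jpg should remove File1.jpg from bucketb and overwrite File2.jpg and File4.jpg, leaving File3.jpg and File5.jpg untouched. Unlike sync, cp copies every listed key that exists, whether or not it has changed.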
John Rotenstein answered Sep 21 '22 12:09