
AWS S3 sync between buckets overwriting newer destination files

We have two S3 buckets, and a cron job runs a sync that should copy changes from bucket1 to bucket2.

aws s3 sync s3://bucket1/images/ s3://bucket2/images/

When a new image is added to bucket1, it correctly gets copied over to bucket2.

However, if we upload a new version of that image to bucket2, when the sync job next runs it actually copies the older version from bucket1 over to bucket2, replacing the newer version we just put there.

This is part of a migration process. In time, images will only be uploaded to bucket2, but for now they may be uploaded to either bucket. We only want changes from bucket1 to be copied to bucket2, NOT the other way round.

Why does the aws sync job seem to think that the file on bucket1 has changed? Does it not know that the file in bucket2 is newer, so it should be left alone?

northernMonkey asked Jul 12 '17 09:07




1 Answer

The AWS Command-Line Interface (CLI) aws s3 sync command copies content from the Source location to the Destination location. It only copies files that have been added or changed since the last sync.

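It is designed as a one-way sync, not a two-way sync. Your file is being overwritten because sync sees that the object in the Source differs from the one in the Destination (a re-uploaded image will usually differ in size), so it copies the Source version across regardless of which side is newer. This is expected behavior.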
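To make that comparison concrete, here is a minimal Python sketch of the decision as I understand it. It assumes bucket-to-bucket sync uses the same size and last-modified comparison that the documentation describes for uploads. `ObjectInfo` and `default_sync_would_copy` are hypothetical names for illustration, not part of the AWS CLI.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ObjectInfo:
    """Hypothetical stand-in for the object metadata that sync compares."""
    size: int
    last_modified: datetime


def default_sync_would_copy(src: ObjectInfo, dest: Optional[ObjectInfo]) -> bool:
    """Simplified model (an assumption, not the CLI's actual code)."""
    if dest is None:
        # Object does not exist in the destination yet.
        return True
    if src.size != dest.size:
        # A size difference alone triggers a copy, even if dest is newer.
        return True
    # Same size: copy only when the source was modified more recently.
    return src.last_modified > dest.last_modified


if __name__ == "__main__":
    old_in_bucket1 = ObjectInfo(1000, datetime(2017, 7, 1, tzinfo=timezone.utc))
    new_in_bucket2 = ObjectInfo(1200, datetime(2017, 7, 10, tzinfo=timezone.utc))
    print(default_sync_would_copy(old_in_bucket1, new_in_bucket2))
```

Under this model, the newer image in bucket2 is replaced simply because its size no longer matches the copy in bucket1.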

The sync command offers only a few options for adjusting this behavior, such as (from the sync command documentation):

--exact-timestamps (boolean) When syncing from S3 to local, same-sized items will be ignored only when the timestamps match exactly. The default behavior is to ignore same-sized items unless the local version is newer than the S3 version.

However, there does not appear to be an option that stops overwriting of files merely because a file with the same name exists, or something with a preference to keep newer files.

If you want a two-way sync with more specific rules, you will need to code it yourself.
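As a starting point for such custom code, here is a minimal Python sketch of the rule the question asks for: copy an object only when it is missing from the destination or the source copy is strictly newer. `plan_copies` is a hypothetical helper, and the listings are plain dictionaries of key to last-modified time. In practice they could be built from boto3's `list_objects_v2` results (the `Key` and `LastModified` fields), with each planned key then copied using `copy_object`. That wiring is an assumption and is left out here.

```python
from datetime import datetime, timezone
from typing import Dict, List


def plan_copies(source: Dict[str, datetime], dest: Dict[str, datetime]) -> List[str]:
    """Return the keys to copy from source to dest, in sorted order.

    A key is copied when it is absent from dest, or when the source
    object was modified strictly later than the destination object.
    Newer destination objects are left alone.
    """
    to_copy = []
    for key, src_time in source.items():
        dest_time = dest.get(key)
        if dest_time is None or src_time > dest_time:
            to_copy.append(key)
    return sorted(to_copy)


if __name__ == "__main__":
    early = datetime(2017, 7, 1, tzinfo=timezone.utc)
    late = datetime(2017, 7, 10, tzinfo=timezone.utc)
    bucket1 = {"images/a.jpg": early, "images/b.jpg": late, "images/c.jpg": early}
    bucket2 = {"images/a.jpg": late, "images/b.jpg": early}
    # a.jpg is newer in bucket2 (skipped); b.jpg is newer in bucket1;
    # c.jpg is missing from bucket2.
    print(plan_copies(bucket1, bucket2))
```

Because a copied object receives a fresh last-modified time in the destination, a key copied on one run should not be selected again until it changes in the source.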

John Rotenstein answered Nov 14 '22 02:11