Retain owner and file permissions info when syncing to AWS S3 Bucket from Linux

I am syncing a directory to AWS S3 from a Linux server for backup.

rsync -a --exclude 'cache' /path/live /path/backup
aws s3 sync /path/backup s3://myBucket/backup --delete

However, I noticed that when I want to restore a backup like so:

aws s3 sync s3://myBucket/backup /path/live/ --delete

The owner and file permissions are different. Is there anything I can do or change in the code to retain the original Linux information of the files?

Thanks!

asked Dec 25 '16 by Yevgen

People also ask

What permissions are needed for S3 sync?

To run the command aws s3 sync, you need the s3:GetObject, s3:PutObject, and s3:ListBucket permissions. Note: if you're using the AssumeRole API operation to access Amazon S3, you must also verify that the trust relationship is configured correctly.
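As an illustration, a minimal IAM policy along those lines might look like the sketch below. The bucket name is taken from the question; the IAM user and policy names are placeholders, and s3:DeleteObject is included on the assumption that you use --delete as in the question's commands.

cat > sync-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::myBucket"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::myBucket/*"
    }
  ]
}
EOF

# attach the policy to an IAM user (user and policy names are placeholders)
aws iam put-user-policy --user-name backup-user \
  --policy-name s3-backup-sync --policy-document file://sync-policy.json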

Does S3 retain file permissions?

S3 isn't a Linux file system, so it won't retain Linux permissions; they simply don't apply to S3. You could create a tar archive and copy that to S3, which would retain the permission information, but then it would no longer be an incremental sync.
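A minimal sketch of that tar approach, assuming the paths from the question (tar stores ownership and mode in the archive, -p restores permissions on extraction, and aws s3 cp can stream to and from stdin via -):

# create the archive and stream it to S3 in one go
tar -cf - -C /path live | aws s3 cp - s3://myBucket/backup/live.tar

# restore: stream it back and extract with permissions preserved (-p)
aws s3 cp s3://myBucket/backup/live.tar - | tar -xpf - -C /path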

What is the --acl bucket-owner-full-control option?

By default, in a cross-account scenario where other AWS accounts upload objects to your Amazon S3 bucket, the objects remain owned by the uploading account. When the bucket-owner-full-control ACL is added, the bucket owner has full control over any new objects that are written by other accounts.
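For example, an uploader in another account could grant the bucket owner control like this (the file and bucket names are placeholders):

aws s3 cp file.txt s3://their-bucket/file.txt --acl bucket-owner-full-control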


1 Answer

I stumbled on this question while looking for something else and figured you (or someone) might like to know that there are other tools which can preserve the original (Linux) ownership information. There may be others, but I know s3cmd can keep ownership information (stored in the metadata of the object in the bucket) and restore it if you sync back to a Linux box.

The syntax for syncing is as follows:

/usr/bin/s3cmd --recursive --preserve sync /path/ s3://mybucket/path/

And you can sync it back with the same command, just reversing the source and destination.
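For instance, restoring to the original location would be:

/usr/bin/s3cmd --recursive --preserve sync s3://mybucket/path/ /path/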

But, as you might know (if you have done a little research on S3 cost optimisation), depending on the situation it can be wiser to use a single compressed file. It saves space and requires fewer requests, so you could end up with some savings at the end of the month.
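A rough sketch of that idea, assuming the paths from the question (a single compressed archive means one PUT request instead of one per file, and the tar itself preserves ownership and permissions):

# pack and compress once, then upload as a single object
tar -czf backup.tar.gz -C /path live
/usr/bin/s3cmd put backup.tar.gz s3://mybucket/backup.tar.gz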

Also, s3cmd is not the fastest tool to synchronise with S3, as it does not use multi-threading (and there are no plans to add it) like some other tools do, so you might want to look for a tool that preserves ownership and also benefits from multi-threading, if that's still what you're looking for. To speed up data transfer with s3cmd, you can run multiple s3cmd processes with different --exclude and --include patterns.

For example:

/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="a*" sync /path/ s3://mybucket/path/ & \
/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="b*" sync /path/ s3://mybucket/path/ & \
/usr/bin/s3cmd --recursive --preserve --exclude="*" --include="c*" sync /path/ s3://mybucket/path/
answered Sep 16 '22 by Serge