How do I exclude multiple folders when using aws s3 sync?
I tried:
    # aws s3 sync s3://inksedge-app-file-storage-bucket-prod-env \
        s3://inksedge-app-file-storage-bucket-test-env \
        --exclude 'reportTemplate/* orders/* customers/*'
But it still syncs the "customers" folder.
Output:
    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/IMG_4800.jpg to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/IMG_4800.jpg
    copy: s3://inksedge-app-file-storage-bucket-prod-env/customers/116/miniimages/DSC_0358.JPG to s3://inksedge-app-file-storage-bucket-test-env/customers/116/miniimages/DSC_0358.JPG
Syncs directories and S3 prefixes. Recursively copies new and updated files from the source directory to the destination. Only creates folders in the destination if they contain one or more files.
It only copies files that have been added or changed since the last sync. It is designed as a one-way sync, not a two-way sync. Your file is being overwritten because the file in the Source is not present in the Destination. This is correct behavior.
The "_$folder$" files are placeholders. Apache Hadoop creates these files when you use the -mkdir command to create a folder in an S3 bucket. Hadoop doesn't create the folder until you PUT the first object. If you delete the "_$folder$" files before you PUT at least one object, Hadoop can't create the folder.
Using "folders" has no performance impact on S3, either way. It doesn't make it faster, and it doesn't make it slower. The value of delimiting your object keys with / is in organization, both machine-friendly and human-friendly.
This finally worked for me, passing each pattern in its own --exclude flag:
    aws s3 sync s3://my-bucket s3://my-other-bucket \
        --exclude 'customers/*' \
        --exclude 'orders/*' \
        --exclude 'reportTemplate/*'
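If the list of folders is long or changes often, the repeated flags can be generated instead of typed by hand. The following is only a minimal sketch: the loop and the `preview` variable are illustrative and not part of the AWS CLI, and the bucket names are the placeholders from the command above. It builds one `--exclude` flag per folder and prints the resulting command rather than running it.

```shell
#!/bin/sh
# Sketch: build one --exclude flag per folder name.
# Note: 'set --' replaces this script's own positional parameters.
set --
for folder in customers orders reportTemplate; do
  set -- "$@" --exclude "${folder}/*"
done

# Preview only. The printed text shows the patterns without their quotes,
# but each pattern is still held as a single, unexpanded argument in "$@".
preview="aws s3 sync s3://my-bucket s3://my-other-bucket $*"
echo "$preview"

# To run it for real (requires the AWS CLI and credentials):
# aws s3 sync s3://my-bucket s3://my-other-bucket "$@"
```

Adding `--dryrun` to the real command should list the copies the sync would perform without transferring anything, which is a cheap way to check the filters first.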
Hint: enclose wildcards and special characters in single or double quotes so that the shell does not expand them before the AWS CLI sees them. The supported pattern characters are listed below. For more information on the S3 commands, see the AWS CLI documentation.
    *: Matches everything
    ?: Matches any single character
    [sequence]: Matches any character in sequence
    [!sequence]: Matches any character not in sequence
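To get a feel for these pattern characters without touching a bucket, they can be approximated with the shell's own `case` pattern matching, which uses the same four constructs. This is a rough sketch, not the AWS CLI's actual matcher; the `matches` helper and the sample keys other than the one from the question are made up for illustration. It also shows why the original attempt failed: a single quoted string containing three patterns is treated as one pattern.

```shell
#!/bin/sh
# Hypothetical helper: succeed if key $1 matches glob pattern $2.
# Leaving $2 unquoted inside 'case' makes the shell treat it as a pattern.
matches() {
  case "$1" in
    $2) return 0 ;;
    *)  return 1 ;;
  esac
}

key='customers/116/miniimages/IMG_4800.jpg'

# One pattern per flag: '*' also spans the nested "folders".
matches "$key" 'customers/*' && echo "customers/* matches"

# Three patterns squeezed into one string form a single pattern that
# would require the key to start with "reportTemplate/".
matches "$key" 'reportTemplate/* orders/* customers/*' || echo "combined string does not match"

# '?' matches exactly one character; '[!...]' negates a character set.
matches 'file1.txt' 'file?.txt'  && echo "? matches one character"
matches 'a.txt'     '[!a-c].txt' || echo "[!a-c] rejects a"
```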
For those who want to sync only a subfolder of a bucket: the exclude and include filters are matched against paths relative to the folder being synced, not against the full path within the bucket. For example:
aws s3 sync s3://bucket1/bootstrap/ s3://bucket2/bootstrap --exclude '*' --include 'css/*'
would sync bootstrap/css but neither bootstrap/js nor bootstrap/fonts, given the following folder tree:
    bootstrap/
    ├── css/
    │   ├── bootstrap.css
    │   ├── bootstrap.min.css
    │   ├── bootstrap-theme.css
    │   └── bootstrap-theme.min.css
    ├── js/
    │   ├── bootstrap.js
    │   └── bootstrap.min.js
    └── fonts/
        ├── glyphicons-halflings-regular.eot
        ├── glyphicons-halflings-regular.svg
        ├── glyphicons-halflings-regular.ttf
        └── glyphicons-halflings-regular.woff
That is, the filter is 'css/*' and not 'bootstrap/css/*'
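The two rules at work here (paths are taken relative to the source prefix, and a later filter overrides an earlier one) can be sketched in plain shell. This is an illustrative model only, not the AWS CLI's implementation: the `decide` function and `source_prefix` variable are made-up names, and shell `case` patterns stand in for the CLI's matcher.

```shell
#!/bin/sh
# Source prefix of the sync, as in s3://bucket1/bootstrap/
source_prefix='bootstrap/'

# Hypothetical model of: --exclude '*' --include 'css/*'
decide() {
  rel=${1#"$source_prefix"}      # filters see the path relative to the source
  verdict=include                # by default every file is included
  case "$rel" in *)     verdict=exclude ;; esac   # --exclude '*'
  case "$rel" in css/*) verdict=include ;; esac   # --include 'css/*' comes later, so it wins
  echo "$verdict"
}

decide 'bootstrap/css/bootstrap.css'   # prints: include
decide 'bootstrap/js/bootstrap.js'     # prints: exclude
```

With the order reversed (`--include 'css/*' --exclude '*'`), the final `--exclude '*'` would override the include and nothing would be synced.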
More details: https://docs.aws.amazon.com/cli/latest/reference/s3/index.html#use-of-exclude-and-include-filters