For some reason there's a bucket with a bunch of different files, all of which have the same prefix but with different dates:
backup.2017-01-01aa
backup.2017-01-01ab
backup.2017-01-15aa
backup.2017-01-15ab
backup.2017-02-01aa
backup.2017-02-01ab
etc.
How do I download only files that start with "backup.2017-01-01"?
I think --include does the filtering locally. So if your bucket contains millions of files, the command can take hours to run, because it first has to download a list of every filename in the bucket. That also costs some extra network traffic.
But aws s3 ls can take a truncated filename (a key prefix) and list only the matching files, without any extra traffic. So you can run
aws s3 ls s3://yourbucket/backup.2017-
to see your files, and something like
aws s3 ls s3://yourbucket/backup.2017- | colrm 1 31 | xargs -I % aws s3 cp s3://yourbucket/% .
to copy your files.
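The colrm 1 31 step relies on the fixed-width date, time, and size columns that aws s3 ls prints. A slightly more defensive sketch, assuming that same four-column layout and key names without whitespace, takes the fourth field with awk instead. The function name list_keys and the sample listing are made up for illustration; they are not part of the AWS CLI.

```shell
# Hypothetical helper: pull the key name (4th column) out of `aws s3 ls` output.
# Assumes lines look like "2017-01-01 03:00:12    1048576 backup.2017-01-01aa"
# and that keys contain no whitespace. Shorter lines (e.g. "PRE dir/") are skipped.
list_keys() {
  awk 'NF >= 4 { print $4 }'
}

# Sample listing standing in for: aws s3 ls s3://yourbucket/backup.2017-01-01
sample='2017-01-01 03:00:12    1048576 backup.2017-01-01aa
2017-01-01 03:00:15    1048576 backup.2017-01-01ab'

printf '%s\n' "$sample" | list_keys
```

Against a real bucket, the same helper could replace colrm in the pipeline: aws s3 ls s3://yourbucket/backup.2017-01-01 | list_keys | xargs -I % aws s3 cp s3://yourbucket/% .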
You'll have to use aws s3 sync s3://yourbucket/
There are two parameters you can give to aws s3 sync: --exclude and --include, both of which can take the "*" wildcard.
First we'll have to --exclude "*" to exclude all of the files, and then we'll --include "backup.2017-01-01*" to include all the files we want with the specific prefix. Obviously you can change the include around, so you could also do something like --include "*-01-01*".
That's it, here's the full command:
aws s3 sync s3://yourbucket/ . --exclude "*" --include "backup.2017-01-01*"
Also, remember to use --dryrun to test your command and avoid downloading all files in the bucket.
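The order of the two flags matters because, in the AWS CLI, filters that appear later in the command take precedence over earlier ones. The toy function below emulates that rule with plain shell globbing so the effect can be seen without touching a bucket. keep_key is a hypothetical helper, and shell case patterns only approximate the CLI's own pattern matching.

```shell
# Toy emulation of "the last matching filter wins".
# keep_key is a hypothetical helper, not part of the AWS CLI.
# Usage: keep_key KEY [exclude PATTERN | include PATTERN]...
keep_key() {
  key=$1; shift
  keep=yes                      # with no filters, every key is kept
  while [ "$#" -ge 2 ]; do
    kind=$1; pattern=$2; shift 2
    # An unquoted $pattern is matched as a glob by `case`
    case $key in
      $pattern) [ "$kind" = include ] && keep=yes || keep=no ;;
    esac
  done
  [ "$keep" = yes ]
}

for key in backup.2017-01-01aa backup.2017-01-15aa; do
  if keep_key "$key" exclude '*' include 'backup.2017-01-01*'; then
    echo "download $key"
  else
    echo "skip     $key"
  fi
done
```

Swapping the two filters (include first, then exclude '*') makes the function reject every key, which mirrors why --exclude "*" has to come before --include in the real command.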