I need to download files recursively from a s3 bucket. The s3 bucket lets anonymous access.
How to list files and download them without providing AWS Access Key using an anonymous user?
My command is:
aws s3 cp s3://anonymous@big-data-benchmark/pavlo/text/tiny/rankings/uservisits uservisit --region us-east --recursive
The aws compains that:
Unable to locate credentials. You can configure credentials by running "aws
configure"
You can use no-sign-request
option
aws s3 cp s3://anonymous@big-data-benchmark/pavlo/text/tiny/rankings/uservisits uservisit --region us-east --recursive --no-sign-request
you probably have to provide an access keys and secret key, even if you're doing anonymous access. don't see an option for anonymous for the AWS cli.
another way to do this, it to hit the http endpoint and grab the files that way.
In your case: http://big-data-benchmark.s3.amazonaws.com
You will get and XML listing all the keys in the bucket. You can extract the keys and issues requests for each. Not the fastest thing out there but it will get the job done.
For example: http://big-data-benchmark.s3.amazonaws.com/pavlo/sequence-snappy/5nodes/crawl/000741_0
for getting the files curl should be enough. for parsing the xml depending on what you like you can go as lo-level as sed and as high-level as a proper language.
hope this helps.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With