Access aws s3 public bucket

I am trying to download data from one of Amazon's public buckets. Here is a description of the bucket in question

The bucket has web-accessible folders, and I would like to download, say, all the files listed in one such folder. There will be a long list of suitable tiles, so the goal is to get all the files in a folder in one go rather than downloading each one individually from the HTTP site.

From other StackOverflow questions I realize I need to use the REST endpoint and use a tool like the AWS CLI or Cyberduck, but I cannot get these to work as yet.

I think the issue may be authentication. I don't have an AWS account, and I was hoping to stick with guest / anonymous access. Does anyone have a good solution / tool to traverse a public bucket and grab the contents as a guest? Could a different approach using curl or wget work for this type of task?

Thanks.

Grant asked Jul 13 '16 23:07


1 Answer

For the AWS CLI, you need to provide the --no-sign-request flag to skip signing. Example:

> aws s3 ls landsat-pds
Unable to locate credentials. You can configure credentials by running "aws configure".
> aws s3 ls landsat-pds --no-sign-request
                           PRE L8/
                           PRE landsat-pds_stats/
                           PRE runs/
                           PRE tarq/
                           PRE tarq_corrupt/
                           PRE test/
2015-01-28 10:13:53      23764 index.html
2015-04-14 10:43:22         25 robots.txt
2016-07-13 12:53:31         38 run_info.json
2016-07-13 12:53:30   23971821 scene_list.gz

To download that entire bucket into a directory, you would do something like this:

> mkdir landsat-pds
> aws s3 sync s3://landsat-pds landsat-pds --no-sign-request
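
Since the question is about grabbing one folder rather than the whole bucket, the same command can be pointed at a key prefix instead of the bucket root. Here is a minimal sketch, assuming a POSIX shell and an installed AWS CLI. The `sync_public_prefix` helper and the `DRY_RUN` variable are hypothetical names introduced for illustration; they are not part of the AWS CLI.

```shell
#!/bin/sh
# Sketch: sync one folder (key prefix) of a public bucket without credentials.
# sync_public_prefix and DRY_RUN are hypothetical names, not AWS CLI features.
sync_public_prefix() {
    bucket=$1   # bucket name, e.g. landsat-pds
    prefix=$2   # folder (key prefix) inside the bucket, e.g. L8/
    dest=$3     # local directory to download into

    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Only print the command that would be run; nothing is downloaded.
        echo "aws s3 sync s3://$bucket/$prefix $dest --no-sign-request"
        return 0
    fi

    mkdir -p "$dest"
    # --no-sign-request skips request signing, so no AWS account is needed.
    aws s3 sync "s3://$bucket/$prefix" "$dest" --no-sign-request
}
```

Calling `sync_public_prefix landsat-pds L8/ L8` would then fetch everything under the `L8/` prefix shown in the listing above. That prefix is likely very large, so in practice you would probably pass a deeper folder path, and setting `DRY_RUN=1` first lets you check the command before any data is transferred.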
Jordon Phillips answered Oct 05 '22 12:10
