
How to download data from Amazon S3 Requester Pays buckets?

I have been struggling for about a week to download arXiv articles as mentioned here: http://arxiv.org/help/bulk_data_s3#src.

I have tried several tools, including s3Browser and s3cmd. I can log in to my own buckets, but I cannot download data from the arXiv bucket.

I tried:

  1. s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar

See:

$ s3cmd get s3://arxiv/pdf/arXiv_pdf_1001_001.tar


s3://arxiv/pdf/arXiv_pdf_1001_001.tar -> ./arXiv_pdf_1001_001.tar  [1 of 1]
s3://arxiv/pdf/arXiv_pdf_1001_001.tar -> ./arXiv_pdf_1001_001.tar  [1 of 1]
ERROR: S3 error: Unknown error

  2. s3cmd get with the x-amz-request-payer:requester header

It gave me the same error again:

$ s3cmd get --add-header="x-amz-request-payer:requester" s3://arxiv/pdf/arXiv_pdf_manifest.xml
s3://arxiv/pdf/arXiv_pdf_manifest.xml -> ./arXiv_pdf_manifest.xml  [1 of 1]
s3://arxiv/pdf/arXiv_pdf_manifest.xml -> ./arXiv_pdf_manifest.xml  [1 of 1]
ERROR: S3 error: Unknown error

  3. Copying

I have also tried copying files from that folder with the AWS CLI.

$ aws s3 cp s3://arxiv/pdf/arXiv_pdf_1001_001.tar .

A client error (403) occurred when calling the HeadObject operation: Forbidden
Completed 1 part(s) with ... file(s) remaining

This probably means that I made a mistake. The problem is that I don't know what to add, or how, to signal that I agree to pay for the download.

I cannot figure out what I should do to download data from S3. I have read a lot on the AWS sites, but I cannot find a precise solution to my problem anywhere.

How can I bulk download the arXiv data?

pg2455 asked Feb 28 '15 17:02





1 Answer

For me, the problem was that my IAM user didn't have enough permissions. Attaching the AmazonS3FullAccess policy to that user solved it.
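
As a rough sketch of that permission fix from the command line, assuming the AWS CLI is installed and configured with credentials that are allowed to change IAM: the function name `grant_s3_full_access` and the `DRY_RUN` switch are made up for illustration, and the user name you pass in is your own IAM user. AmazonS3FullAccess is a broad managed policy, so a narrower policy may be preferable if it works for you.

```shell
# Hypothetical helper: attach the AWS-managed AmazonS3FullAccess policy
# to an IAM user. With DRY_RUN=1 it only prints the command.
grant_s3_full_access() {
  user="$1"
  policy_arn="arn:aws:iam::aws:policy/AmazonS3FullAccess"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print what would be run, without touching the account.
    echo "aws iam attach-user-policy --user-name ${user} --policy-arn ${policy_arn}"
  else
    aws iam attach-user-policy --user-name "${user}" --policy-arn "${policy_arn}"
  fi
}
```

Calling `DRY_RUN=1 grant_s3_full_access my-iam-user` (a placeholder user name) prints the command so it can be reviewed before running it for real.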

Hope this saves someone some time.
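
Separately from IAM permissions, a request to a Requester Pays bucket has to state that the requester accepts the charges. With the AWS CLI, that is the `--request-payer requester` option of `aws s3 cp`. The sketch below assumes the AWS CLI is installed and configured. The function name `arxiv_get` and the `DRY_RUN` switch are made up for illustration, and the bucket and key are the ones from the question.

```shell
# Hypothetical helper: copy one object from the Requester Pays bucket
# "arxiv" to a local destination (default: current directory).
# With DRY_RUN=1 it only prints the command.
arxiv_get() {
  key="$1"
  dest="${2:-.}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    # Print what would be run, without downloading or incurring charges.
    echo "aws s3 cp s3://arxiv/${key} ${dest} --request-payer requester"
  else
    # --request-payer requester confirms that the caller pays for the transfer.
    aws s3 cp "s3://arxiv/${key}" "${dest}" --request-payer requester
  fi
}
```

For example, `arxiv_get pdf/arXiv_pdf_manifest.xml` would fetch the manifest into the current directory; the transfer is billed to the account whose credentials are used.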

Alan Wagner answered Sep 23 '22 20:09
