
How to access a file on Amazon S3 from the Command Line?

Question:

Is there a simple way to access a data file stored on Amazon S3 directly from the command line?

Motivation:

I'm loosely following an online tutorial where the author links to the following URL:

s3://bml-data/churn-bigml-80.csv

It is a simple CSV file, but I can't open it using my web browser or with curl. The tutorial opens it with BigML, but I want to download the data for myself. Some googling tells me that there are a number of Python and Scala libraries designed for S3 access, but it would be really nice to open or download the file more directly.

I use Mac and am a big fan of homebrew, so the perfect solution (for me) would work on this system.

Bonus Question:

Is there any good way to see the contents of an Amazon S3 bucket (that I don't own)?

The nature of the file (80% of a particular data set) makes me suspect that there may be a churn-bigml-20.csv file hiding somewhere out there. My first instinct would be to try to curl / open the expected file; the solution to the first question will let me check this hunch, but in an ugly way. If anyone knows of a way to remotely explore the contents of a specific S3 bucket, that would be very useful. Again, exploring Google and SO tells me that there are libraries for this, but a more direct approach would be useful.

asked Nov 25 '14 15:11 by GnomeDePlume




2 Answers

The AWS Command Line Interface (CLI) is a unified tool to manage AWS services, including accessing data stored in Amazon S3.

The AWS Command Line Interface is available for Windows, Mac and Linux.

If the bucket owner has granted public permissions for ListBucket, then you can list the contents of the bucket, e.g.:

aws s3 ls s3://bml-data
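
On a machine with no AWS credentials configured, the command above may stop with a credentials error even when the bucket is public. The AWS CLI has a global `--no-sign-request` option that sends the request unsigned. A minimal sketch, assuming the bucket really does allow public access; `s3_public` and the `DRY_RUN` variable are hypothetical names made up for this example, not part of the AWS CLI:

```shell
# Hypothetical wrapper: run an `aws s3` subcommand without signing the request.
# With DRY_RUN=1 it only prints the command it would run.
s3_public() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "aws s3 $* --no-sign-request"
  else
    aws s3 "$@" --no-sign-request
  fi
}
```

For example, `s3_public ls s3://bml-data` would list the bucket anonymously, while `DRY_RUN=1 s3_public ls s3://bml-data` only prints the underlying command.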

If the bucket owner has granted public permissions for GetObject, then you can copy an object:

aws s3 cp s3://bml-data/churn-bigml-80.csv churn-bigml-80.csv

Both of these commands work successfully for me.
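
The reason curl and a web browser fail on the original link is that `s3://` is not a URL scheme they understand. When an object is publicly readable, it can usually be fetched over HTTPS instead, using a virtual-hosted-style URL. A rough sketch of the conversion; `s3_to_https` is a hypothetical helper, and it assumes that the global `s3.amazonaws.com` endpoint works for the bucket and that the key needs no URL-encoding:

```shell
# Hypothetical helper: turn s3://bucket/key into a virtual-hosted-style
# HTTPS URL. The body runs in a subshell so its variables stay local.
s3_to_https() (
  path="${1#s3://}"      # strip the s3:// scheme
  bucket="${path%%/*}"   # everything before the first slash
  case "$path" in
    */*) key="${path#*/}" ;;  # everything after the first slash
    *)   key="" ;;            # bucket only, no key given
  esac
  printf 'https://%s.s3.amazonaws.com/%s\n' "$bucket" "$key"
)

# Example (needs network access and a publicly readable object):
# curl -O "$(s3_to_https s3://bml-data/churn-bigml-80.csv)"
```

This only rewrites the address. Whether the download succeeds still depends on the bucket owner having granted public GetObject permission.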

See also:

  • AWS Command Line Interface Documentation
answered Oct 20 '22 18:10 by John Rotenstein


There's a neat tool called s3cmd that will do this.

  • It works on Mac (with the homebrew package manager)
  • It lets you download from Amazon S3 to your local machine
  • It lets you browse Amazon S3 buckets (even when you don't own them)

Installation and Setup

brew install s3cmd

Configuring s3cmd requires that you have an Amazon S3 account. The account is free, but you need to sign up for it first.

s3cmd --configure

Configuration involves specifying your access / secret key pair, and a few other details (I used defaults for everything). If you want s3cmd to encrypt your files, you can install gpg with brew and set a few more configuration options at this point; HTTPS is a separate option in the same dialog. Be warned: the gpg_passphrase that you use is stored in a local plain-text configuration file!

Use:

Now for the exciting bit: downloading my file to the Desktop!

s3cmd get s3://bml-data/churn-bigml-80.csv ~/Desktop

Listing the contents of the remote bucket:

s3cmd ls s3://bml-data/
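
To check a hunch such as the churn-bigml-20.csv file mentioned in the question, the listing can be filtered rather than read by eye. A small sketch, assuming each line of the listing ends with the object's s3:// URL and that keys contain no spaces; `find_keys` is a hypothetical helper name:

```shell
# Hypothetical helper: read a bucket listing on stdin and print the
# s3:// URLs (taken from the last field of each line) that contain
# the given text.
find_keys() {
  awk -v pat="$1" 'index($NF, pat) > 0 { print $NF }'
}

# Example (needs a configured s3cmd and network access):
# s3cmd ls s3://bml-data/ | find_keys churn-bigml
```

If the output lists only the 80% file, the suspected 20% file is not in the bucket under a name containing that text.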

Additional Functionality:

This is beyond the scope of the question but seems worth mentioning: s3cmd can do other things like put data into the bucket (and make it public with the -P flag), delete files, and show the manual for more information:

s3cmd -P put ~/Desktop/my-file.png s3://mybucket/
s3cmd del s3://mybucket/my-file-to-delete.png
man s3cmd

Credit:

Thanks to Neil Gee for his tutorial on s3cmd.

answered Oct 20 '22 18:10 by GnomeDePlume