Quick way to list all files in Amazon S3 bucket?

Tags:

amazon-s3

People also ask

How do I view contents of a S3 bucket?

To open the overview pane for an object: sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket that contains the object. In the Objects list, choose the name of the object for which you want an overview.

How do you list items in a bucket?

To get a list of objects within a bucket, use the AmazonS3 client's listObjects method, supplying the name of a bucket. The listObjects method returns an ObjectListing object that provides information about the objects in the bucket.

What CLI command will list all of the S3 buckets you have access to?

The aws s3api list-buckets command returns a list of all buckets owned by the authenticated sender of the request. To use this operation, you must have the s3:ListAllMyBuckets permission. See 'aws help' for descriptions of global parameters.

How can you download an S3 bucket including all folders and files?

To download an entire bucket to your local file system, use the AWS CLI sync command, passing it the S3 bucket as the source and a directory on your file system as the destination, e.g. aws s3 sync s3://YOUR_BUCKET . (the trailing dot is the current directory). The sync command recursively copies the contents of the source to the destination.

I'd recommend using boto. Then it's a quick couple of lines of Python:

from boto.s3.connection import S3Connection

conn = S3Connection('access-key','secret-access-key')
bucket = conn.get_bucket('bucket')
for key in bucket.list():
    print(key.name.encode('utf-8'))

Save this as list.py, open a terminal, and then run:

$ python list.py > results.txt

AWS CLI

Documentation for aws s3 ls

AWS has released its Command Line Tools. They work much like boto and can be installed using sudo easy_install awscli or sudo pip install awscli

Once they are installed, you can simply run

aws s3 ls

Which will show you all of your available buckets

CreationTime        Bucket
------------        ------
2013-07-11 17:08:50 mybucket
2013-07-24 14:55:44 mybucket2

You can then query a specific bucket for files.

Command:

aws s3 ls s3://mybucket

Output:

Bucket: mybucket
Prefix:

      LastWriteTime     Length Name
      -------------     ------ ----
                           PRE somePrefix/
2013-07-25 17:06:27         88 test.txt

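This will show you the files and prefixes at the top level of the bucket. Lines marked PRE are prefixes ("folders"); add --recursive to list the objects under them as well.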
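For anyone who wants the same top-level view from Python, here is a minimal sketch using boto3's list_objects_v2 paginator with Delimiter="/". The function name list_top_level is made up for illustration, and the client is assumed to be something like boto3.client("s3") with credentials already configured.

```python
def list_top_level(client, bucket, prefix=""):
    """Return (prefixes, keys) found directly under `prefix`.

    `client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    With Delimiter="/", S3 groups deeper keys into CommonPrefixes, which is
    roughly what the CLI prints as PRE lines.
    """
    prefixes, keys = [], []
    paginator = client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
        # Either entry may be missing from a page, so default to an empty list.
        prefixes.extend(p["Prefix"] for p in page.get("CommonPrefixes", []))
        keys.extend(o["Key"] for o in page.get("Contents", []))
    return prefixes, keys

# Hypothetical usage:
#   import boto3
#   prefixes, keys = list_top_level(boto3.client("s3"), "mybucket")
```

Passing the client in as a parameter keeps the function easy to try out with a stub before pointing it at a real bucket.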

s3cmd is invaluable for this kind of thing

$ s3cmd ls -r s3://yourbucket/ | awk '{print $4}' > objects_in_bucket
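One caveat with awk '{print $4}' is that it cuts off object names that contain spaces. A rough Python sketch that keeps the whole name, assuming each listing line has the form "date time size s3://bucket/key" (the helper name keys_from_listing is hypothetical):

```python
def keys_from_listing(lines):
    """Extract the object path from each line of an `s3cmd ls -r` style listing.

    Assumes four whitespace-separated columns: date, time, size, path.
    Splitting at most three times keeps spaces inside the path intact.
    """
    keys = []
    for line in lines:
        parts = line.split(None, 3)
        if len(parts) == 4:  # skip blank lines and DIR-style entries
            keys.append(parts[3].rstrip("\n"))
    return keys

# Hypothetical usage, reading the listing from a pipe:
#   import sys
#   print("\n".join(keys_from_listing(sys.stdin)))
```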

Be careful: a single S3 list request returns at most 1000 objects. If you want to iterate over all of them, you have to paginate the results using markers.

In Ruby, using the aws-s3 gem:

require 'aws/s3'

bucket_name = 'yourBucket'
marker = ""

AWS::S3::Base.establish_connection!(
  :access_key_id => 'your_access_key_id',
  :secret_access_key => 'your_secret_access_key'
)

loop do
  # Fetch up to 1000 objects whose keys sort after the marker.
  objects = AWS::S3::Bucket.objects(bucket_name, :marker => marker, :max_keys => 1000)
  break if objects.size == 0
  # The last key of this page becomes the marker for the next page.
  marker = objects.last.key

  objects.each do |obj|
    puts obj.key
  end
end
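The same marker loop can be sketched in Python. This assumes a boto3 S3 client (boto3.client("s3")) and its list_objects call with the Marker and MaxKeys parameters; the generator name iter_keys_with_marker is made up for illustration.

```python
def iter_keys_with_marker(client, bucket, page_size=1000):
    """Yield every key in `bucket`, one list request per page.

    `client` is assumed to be a boto3 S3 client. After each page, the last
    key seen is sent as the Marker so the next page starts after it.
    """
    marker = ""
    while True:
        kwargs = {"Bucket": bucket, "MaxKeys": page_size}
        if marker:
            kwargs["Marker"] = marker
        response = client.list_objects(**kwargs)
        contents = response.get("Contents", [])
        for obj in contents:
            yield obj["Key"]
        # Stop when S3 reports no further pages (or the page was empty).
        if not response.get("IsTruncated") or not contents:
            break
        marker = contents[-1]["Key"]

# Hypothetical usage:
#   import boto3
#   for key in iter_keys_with_marker(boto3.client("s3"), "yourBucket"):
#       print(key)
```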

Hope this helps, vincent

Update 15-02-2019:

This command will give you a list of all buckets in AWS S3:

aws s3 ls

This command will give you a list of all top-level objects inside an AWS S3 bucket:

aws s3 ls bucket-name

This command will give you a list of ALL objects inside an AWS S3 bucket:

aws s3 ls bucket-name --recursive

This command will place a list of ALL objects inside an AWS S3 bucket into a text file in your current directory:

aws s3 ls bucket-name --recursive | cat >> file-name.txt

There are a couple of ways you can go about it. Using Python:

import boto3

# Replace the placeholder strings with your own credentials.
session = boto3.Session(
    aws_access_key_id='access-key',
    aws_secret_access_key='secret-access-key',
)

s3 = session.resource('s3')

bucketName = 'testbucket133'
bucket = s3.Bucket(bucketName)

for obj in bucket.objects.all():
    print(obj.key)
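As far as I know, bucket.objects.all() fetches further pages for you, so the loop above is not limited to the first 1000 keys. If you only need part of the bucket, the same collection has a filter(Prefix=...) method. A small sketch, where keys_under is a hypothetical helper and bucket is assumed to be a boto3 Bucket resource like the one created above:

```python
def keys_under(bucket, prefix):
    """Return (key, size) pairs for every object whose key starts with `prefix`.

    `bucket` is assumed to be a boto3 Bucket resource, e.g.
    boto3.resource("s3").Bucket("testbucket133").
    """
    return [(obj.key, obj.size) for obj in bucket.objects.filter(Prefix=prefix)]

# Hypothetical usage:
#   for key, size in keys_under(bucket, "logs/"):
#       print(key, size)
```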

Another way is to use the AWS CLI:

aws s3 ls s3://{bucketname}
Example: aws s3 ls s3://testbucket133

For Scala developers, here is a recursive function that performs a full scan and maps the contents of an Amazon S3 bucket using the official AWS SDK for Java:

import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.{S3ObjectSummary, ObjectListing, GetObjectRequest}
import scala.collection.JavaConversions.{collectionAsScalaIterable => asScala}

def map[T](s3: AmazonS3Client, bucket: String, prefix: String)(f: (S3ObjectSummary) => T) = {

  def scan(acc:List[T], listing:ObjectListing): List[T] = {
    val summaries = asScala[S3ObjectSummary](listing.getObjectSummaries())
    val mapped = (for (summary <- summaries) yield f(summary)).toList

    if (!listing.isTruncated) acc ::: mapped  // last page: include results from earlier pages
    else scan(acc ::: mapped, s3.listNextBatchOfObjects(listing))
  }

  scan(List(), s3.listObjects(bucket, prefix))
}

To invoke the curried map() function above, pass an already constructed and properly initialized AmazonS3Client object (refer to the official AWS SDK for Java API Reference), the bucket name and the prefix in the first parameter list. In the second parameter list, pass the function f() you want to apply to each object summary.

For example

val keyOwnerTuples = map(s3, bucket, prefix)(s => (s.getKey, s.getOwner))

will return the full list of (key, owner) tuples in that bucket/prefix, or

map(s3, "bucket", "prefix")(s => println(s))

will print every object summary, much as you would use map on a collection in functional programming.
