 

Quickly finding the size of an S3 'folder'

We have S3 'folders' (objects sharing a common key prefix in a bucket) with millions and millions of files, and we want to figure out the size of these folders.

Writing my own .NET application to get the list of S3 objects was easy enough, but the maximum number of keys per list request is 1,000, so it's taking forever.

Using S3Browser to look at a folder's properties is taking a long time too, I'm guessing for the same reason.

I've had this .NET application running for a week - I need a better solution.

Is there a faster way to do this?
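
For reference, the same 1,000-key paging applies to any client. A minimal AWS CLI sketch (bucket and prefix names below are placeholders) that lets the CLI follow the continuation tokens and total the sizes:

# Each underlying ListObjectsV2 call returns at most 1,000 keys; the CLI
# pages through continuation tokens automatically and applies --query to
# the merged result. my-bucket and my/prefix/ are placeholders.
aws s3api list-objects-v2 \
    --bucket my-bucket \
    --prefix my/prefix/ \
    --query '[length(Contents[]), sum(Contents[].Size)]' \
    --output text

This still makes one request per 1,000 keys, so it is no faster than the .NET loop; it just avoids writing the pager yourself.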

asked Apr 29 '15 by b15



3 Answers

The AWS CLI's ls command can do this:

aws s3 ls --summarize --human-readable --recursive s3://$BUCKETNAME/$PREFIX --region $REGION
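
With --summarize, the last two lines of the listing carry the totals, and --human-readable formats the size. Output looks something like this (the numbers here are made up for illustration):

Total Objects: 2345678
   Total Size: 1.2 TiB

Note this still pages through every key under the prefix behind the scenes, so it is not instant on millions of objects, but it beats writing and babysitting your own lister.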

answered by Foolish Brilliance


It seems AWS has since added a console menu item that shows a folder's size:

[screenshot: size of S3 folder]

answered by Filippo Loddo


I prefer using the AWS CLI; I find that the web console often times out when there are too many objects.

  • replace s3://bucket/ with where you want to start from.
  • relies on awscli, awk, tail, and some bash-like shell
start=s3://bucket/ && \
for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do
  echo ">>> $prefix <<<"
  aws s3 ls "$start$prefix" --recursive --summarize | tail -n2
done

or in one-line form:

start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n2; done

Output looks something like:

$ start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n2; done
>>> extracts/ <<<
Total Objects: 23
   Total Size: 10633858646
>>> hackathon/ <<<
Total Objects: 2
   Total Size: 10004
>>> home/ <<<
Total Objects: 102
   Total Size: 1421736087
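
If you want one grand total instead of per-prefix summaries, you can sum the size column of a recursive listing with awk (a sketch; s3://bucket/ is a placeholder, and this too lists every object):

# Field $3 of each "aws s3 ls --recursive" line is the object size in bytes.
aws s3 ls s3://bucket/ --recursive | awk '{bytes += $3} END {printf "Total: %d bytes (%.2f GiB)\n", bytes, bytes/1024/1024/1024}'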
answered by debugme