 

Quickly finding the size of an S3 'folder'

We have S3 'folders' (objects sharing a common key prefix in a bucket) with millions and millions of files, and we want to figure out the size of these folders.

Writing my own .NET application to get the list of S3 objects was easy enough, but the maximum number of keys per list request is 1,000, so it's taking forever.

Using S3Browser to look at a folder's properties is taking a long time too, I'm guessing for the same reason.

I've had this .NET application running for a week - I need a better solution.

Is there a faster way to do this?
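
For reference, the same 1,000-key paging applies to any client. A minimal AWS CLI sketch (bucket and prefix names below are placeholders) that lets the CLI follow the continuation tokens and total the sizes:

# Each underlying ListObjectsV2 call returns at most 1,000 keys; the CLI
# pages through continuation tokens automatically and applies --query to
# the merged result. my-bucket and my/prefix/ are placeholders.
aws s3api list-objects-v2 \
    --bucket my-bucket \
    --prefix my/prefix/ \
    --query '[length(Contents[]), sum(Contents[].Size)]' \
    --output text

This still makes one request per 1,000 keys, so it is no faster than the .NET loop; it just avoids writing the pager yourself.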

asked Apr 29 '15 by b15



3 Answers

The AWS CLI's ls command can do this:

aws s3 ls --summarize --human-readable --recursive s3://$BUCKETNAME/$PREFIX --region $REGION
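
With --summarize, the last two lines of the listing carry the totals, and --human-readable formats the size. Output looks something like this (the numbers here are made up for illustration):

Total Objects: 2345678
   Total Size: 1.2 TiB

Note this still pages through every key under the prefix behind the scenes, so it is not instant on millions of objects, but it beats writing and babysitting your own lister.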

answered by Foolish Brilliance


It seems AWS has since added a console menu item that shows a folder's size:

[screenshot: size of S3 folder]

answered by Filippo Loddo


I prefer using the AWS CLI; I find that the web console often times out when there are too many objects.

  • replace s3://bucket/ with where you want to start from.
  • relies on awscli, awk, tail, and some bash-like shell
start=s3://bucket/ && \
for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do
  echo ">>> $prefix <<<"
  aws s3 ls "$start$prefix" --recursive --summarize | tail -n2
done

or in one-line form:

start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n2; done

Output looks something like:

$ start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n2; done
>>> extracts/ <<<
Total Objects: 23
   Total Size: 10633858646
>>> hackathon/ <<<
Total Objects: 2
   Total Size: 10004
>>> home/ <<<
Total Objects: 102
   Total Size: 1421736087
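
If you want one grand total instead of per-prefix summaries, you can sum the size column of a recursive listing with awk (a sketch; s3://bucket/ is a placeholder, and this too lists every object):

# Field $3 of each "aws s3 ls --recursive" line is the object size in bytes.
aws s3 ls s3://bucket/ --recursive | awk '{bytes += $3} END {printf "Total: %d bytes (%.2f GiB)\n", bytes, bytes/1024/1024/1024}'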
answered by debugme