We have S3 'folders' (objects sharing a key prefix within a bucket) containing millions and millions of files, and we want to figure out the total size of these folders.
Writing my own .NET application to list the S3 objects was easy enough, but the maximum number of keys per request is 1,000, so it's taking forever.
Using S3Browser to look at a 'folder's' properties takes a long time too, I'm guessing for the same reason.
I've had this .NET application running for a week - I need a better solution.
Is there a faster way to do this?
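For reference, the listing loop in question amounts to something like the following sketch, written here with the AWS CLI's s3api commands rather than the actual .NET code; the bucket and prefix names are placeholders, and jq is assumed for JSON parsing. Each list-objects-v2 call returns at most 1,000 keys plus a continuation token, which is why millions of objects mean thousands of sequential round trips:

# Placeholder bucket/prefix; requires jq. Each page returns at most 1,000 keys.
bucket=my-bucket
prefix=my-folder/
token=""
total_size=0
total_objects=0
while :; do
    if [ -n "$token" ]; then
        page=$(aws s3api list-objects-v2 --bucket "$bucket" --prefix "$prefix" \
            --continuation-token "$token" --output json)
    else
        page=$(aws s3api list-objects-v2 --bucket "$bucket" --prefix "$prefix" --output json)
    fi
    # Accumulate sizes and counts from this page of (up to) 1,000 keys
    total_size=$(( total_size + $(echo "$page" | jq '[.Contents[]?.Size] | add // 0') ))
    total_objects=$(( total_objects + $(echo "$page" | jq '.Contents | length') ))
    # NextContinuationToken is absent on the last page
    token=$(echo "$page" | jq -r '.NextContinuationToken // empty')
    [ -z "$token" ] && break
done
echo "Total Objects: $total_objects"
echo "Total Size: $total_size"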
The AWS CLI's ls command can do this:

aws s3 ls --summarize --human-readable --recursive s3://$BUCKETNAME/$PREFIX --region $REGION
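If you only need the totals, a compact s3api variant can work too. Note the CLI still pages through the listing 1,000 keys at a time under the hood, so this is no faster, just terser; the --query expression is applied client-side after all pages are fetched, and sum() errors out if the prefix matches no objects:

# Prints [total bytes, object count] for the prefix
aws s3api list-objects-v2 --bucket "$BUCKETNAME" --prefix "$PREFIX" \
    --query '[sum(Contents[].Size), length(Contents[])]' --output json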
Seems like AWS has since added a console menu item where it's possible to see the size: select the folder and choose Actions > Calculate total size.
I prefer using the AWS CLI. I find that the web console often times out when there are too many objects.
start=s3://bucket/ && \
# Iterate over the top-level prefixes ('folders') in the bucket
for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do
    echo ">>> $prefix <<<"
    # tail -n 2 keeps just the two summary lines: Total Objects and Total Size
    aws s3 ls "$start$prefix" --recursive --summarize | tail -n 2
done
or in one-line form:

start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n 2; done
Output looks something like:
$ start=s3://bucket/ && for prefix in $(aws s3 ls "$start" | awk '{print $2}'); do echo ">>> $prefix <<<"; aws s3 ls "$start$prefix" --recursive --summarize | tail -n 2; done
>>> extracts/ <<<
Total Objects: 23
Total Size: 10633858646
>>> hackathon/ <<<
Total Objects: 2
Total Size: 10004
>>> home/ <<<
Total Objects: 102
Total Size: 1421736087
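If the raw byte counts are hard to read, the --human-readable flag from the first answer works on the inner command as well, printing sizes in KiB/MiB/GiB instead of bytes:

aws s3 ls "$start$prefix" --recursive --summarize --human-readable | tail -n 2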