
How to find zero byte files in Amazon S3

Is there a way to programmatically find zero-byte files in Amazon S3?

The bucket is more than 100 GB in total, so I am unlikely to sync it back to a server and then run

find . -size 0 -type f
ajreal asked May 31 '12 11:05



4 Answers

Combining s3cmd with awk should do the trick.

Note: s3cmd ls outputs four columns: date, time, size and name. Match the size (column 3) against 0 and print the object name (column 4):

$ s3cmd ls -r s3://bucketname | awk '{if ($3 == 0) print $4}'
s3://bucketname/root/
s3://bucketname/root/e

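To see the whole listing line, drop the $4 so that the action is just print. This also keeps names that contain spaces intact, since awk splits on whitespace and $4 alone stops at the first space.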

$ s3cmd ls -r s3://bucketname | awk '{if ($3 == 0) print}' 
2013-03-04 06:28         0   s3://bucketname/root/
2013-03-04 06:28         0   s3://bucketname/root/e

Memory-wise, this should be fine as it's a simple bucket listing.
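The same column matching can be sketched in Python, for cases where the listing needs further processing. This is a minimal illustration, assuming the four-column `s3cmd ls -r` layout shown above; the function name `zero_byte_names` and the third sample line are made up for the example.

```python
def zero_byte_names(listing_lines):
    """Return object names whose size column is 0.

    Assumes each line looks like: DATE TIME SIZE NAME (the s3cmd layout above).
    """
    names = []
    for line in listing_lines:
        # Split into at most 4 fields so spaces inside the name are preserved.
        parts = line.split(None, 3)
        if len(parts) != 4:
            continue  # skip blank or unexpected lines
        _date, _time, size, name = parts
        if size == "0":
            names.append(name.rstrip("\r\n"))
    return names


# Hypothetical sample input; the last line is invented for illustration.
sample = [
    "2013-03-04 06:28         0   s3://bucketname/root/",
    "2013-03-04 06:28         0   s3://bucketname/root/e",
    "2013-03-04 06:29      1024   s3://bucketname/root/not empty.txt",
]
print(zero_byte_names(sample))
```

To run it against a real bucket, the function could be fed `sys.stdin` with the output of `s3cmd ls -r s3://bucketname` piped in.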

MeSee answered Oct 15 '22 09:10



Amazon S3 has no direct way to search for zero-byte files. You have to list all the objects and sort them by size, which groups the zero-byte files together.

To get that list with Bucket Explorer, list the objects of the selected bucket and click the Size column header (sort by size); the zero-byte files will then appear together.
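The list-then-sort idea can be sketched in a few lines of Python. This only illustrates the approach and says nothing about how Bucket Explorer works internally; the function name `zero_byte_first` and the sample listing are hypothetical.

```python
from itertools import takewhile


def zero_byte_first(objects):
    """Sort (name, size) pairs by size and return the leading run of size-0 names."""
    by_size = sorted(objects, key=lambda item: item[1])
    # After sorting, all zero-byte entries sit at the front of the list.
    return [name for name, _size in takewhile(lambda item: item[1] == 0, by_size)]


# Hypothetical listing of (name, size in bytes) pairs.
listing = [("root/a.log", 512), ("root/", 0), ("root/e", 0), ("root/b.bin", 2048)]
print(zero_byte_first(listing))
```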

Disclosure: I am a developer of Bucket Explorer.

Tej Kiran answered Oct 15 '22 10:10



Just use Boto:

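from boto.s3.connection import S3Connection

# Fill in your own credentials and bucket name.
aws_access_key = ''
aws_secret_key = ''
bucket_name = ''

s3_conn = S3Connection(aws_access_key, aws_secret_key)
bucket = s3_conn.get_bucket(bucket_name)

# bucket.list() yields keys lazily, so the full listing is never held in memory.
for key in bucket.list():
    if key.size == 0:
        print(key.key)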

As for the number of files: Boto requests only the object metadata (not the actual file content), 1000 keys per request (the AWS limit), and it iterates with a generator, so memory usage stays small.
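The same page-at-a-time listing can be sketched with boto3, the successor to the Boto library. This is an illustration rather than part of the original answer: the helper names `zero_byte_keys` and `list_zero_byte_keys` are made up, and credentials are assumed to come from the usual boto3 configuration (environment variables or ~/.aws).

```python
def zero_byte_keys(pages):
    """Yield keys of size-0 objects from an iterable of list_objects_v2 response pages."""
    for page in pages:
        # "Contents" is absent from a page that holds no objects.
        for obj in page.get("Contents", []):
            if obj["Size"] == 0:
                yield obj["Key"]


def list_zero_byte_keys(bucket_name):
    """Walk a bucket page by page (at most 1000 keys per request)."""
    import boto3  # imported here so zero_byte_keys works without boto3 installed

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    return zero_byte_keys(paginator.paginate(Bucket=bucket_name))
```

Usage would look like `for key in list_zero_byte_keys("bucketname"): print(key)`. Because both helpers are lazy, only one page of metadata is held in memory at a time.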

Derrick Petzold answered Oct 15 '22 09:10



JMESPath query:

aws s3api list-objects --bucket $BUCKET --prefix $PREFIX --output json --query 'Contents[?Size==`0`]'
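For readers unfamiliar with JMESPath: the filter in the command above keeps only the entries of the Contents array whose Size field equals 0. A rough Python equivalent is sketched below, assuming the JSON that `aws s3api list-objects --output json` prints when run without `--query`; the function name `filter_zero_size` and the sample data are invented for illustration.

```python
import json


def filter_zero_size(list_objects_json):
    """Return the entries of "Contents" whose "Size" is 0."""
    response = json.loads(list_objects_json)
    # A listing with no objects may lack the "Contents" key; treat that as empty.
    return [obj for obj in response.get("Contents", []) if obj.get("Size") == 0]


# Hypothetical, trimmed-down CLI output.
sample = '{"Contents": [{"Key": "root/", "Size": 0}, {"Key": "root/a.log", "Size": 512}]}'
print([obj["Key"] for obj in filter_zero_size(sample)])
```

One difference from the CLI: when Contents is missing, the JMESPath query evaluates to null, while this sketch returns an empty list.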
Brock Ackley answered Oct 15 '22 11:10
