
AWS S3 ListBucketResult pagination without authentication?

I'm looking to get a simple listing of all the objects in a public S3 bucket.

I know how to fetch a listing of up to 1000 results with curl, but I don't understand how to paginate the results to get a full listing. I think the marker parameter is a clue.

I do not want to use an SDK or library, or to authenticate. I'm looking for a couple of lines of shell to do this.

hendry asked Oct 20 '25 03:10


1 Answer

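#!/bin/sh
# Page through a public S3 bucket listing, saving each page as listingN.xml.
# Requires curl and xmlstarlet (the "xml" command; some distributions
# install it as "xmlstarlet").

# S3 returns at most 1000 keys per request; a larger max-keys has no effect.
s3url=http://mr2011.s3-ap-southeast-1.amazonaws.com?max-keys=1000
s3ns=http://s3.amazonaws.com/doc/2006-03-01/

i=0
s3get=$s3url

while :; do
    # Quote the URL so the shell never treats "?" as a glob character.
    curl -s "$s3get" > "listing$i.xml"
    # If the page is truncated, print its last key, URI-encoded, to use as
    # the marker for the next request; otherwise print nothing.
    nextkey=$(xml sel -T -N "w=$s3ns" -t \
        --if '/w:ListBucketResult/w:IsTruncated="true"' \
        -v 'str:encode-uri(/w:ListBucketResult/w:Contents[last()]/w:Key, true())' \
        -b -n "listing$i.xml")
    # -b -n adds a newline to the result unconditionally, which avoids
    # the "no XPaths matched" message; $() drops trailing newlines.

    if [ -n "$nextkey" ] ; then
        s3get="$s3url&marker=$nextkey"
        i=$((i+1))
    else
        break
    fi
done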
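
The loop above leaves the listing spread across listing0.xml, listing1.xml, and so on. As a rough sketch of how those pages could be flattened into the plain list of object names the question asks for, the helper below pulls out every Key element with grep and sed. The name list_keys is hypothetical and not part of the original answer, and the sketch assumes a grep that supports -o (GNU, BSD and BusyBox grep do). It does not decode XML entities, so a key containing & would be printed as &amp;.

```shell
#!/bin/sh
# list_keys FILE...: print one object key per line from saved
# ListBucketResult pages. Hypothetical helper, not from the original answer.
list_keys() {
    # cat joins the pages so grep never prefixes matches with a file name.
    # grep -o prints each <Key>...</Key> element on its own line,
    # then sed strips the surrounding tags.
    cat -- "$@" | grep -o '<Key>[^<]*</Key>' | sed -e 's/<[^>]*>//g'
}

# Example (run after the pagination loop has finished):
#   list_keys listing*.xml > all-keys.txt
```

If xmlstarlet is already installed for the loop, using xml sel to print the keys would be the more robust choice, since it parses the XML properly instead of pattern matching on it.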

npostavs answered Oct 22 '25 03:10