
Could we iterate over the complete set of objects in Amazon S3?

Tags: java, amazon-s3

I am trying to print the metadata of all the objects in an S3 bucket. However, the listing never returns more than 1000 objects. I tried using objectListing.isTruncated(), but it did not help. Here is the code I used to try to list more than 1000 objects.

    ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
            .withBucketName(bucketName);
    ObjectListing objectListing;
    do {
        objectListing = s3client.listObjects(listObjectsRequest);
        for (S3ObjectSummary objectSummary :
                objectListing.getObjectSummaries()) {
            System.out.println(" - " + objectSummary.getKey() + "  " +
                    "(size = " + objectSummary.getSize() +
                    ")");

            listObjectsRequest.setMarker(objectListing.getNextMarker());
        }
        listObjectsRequest.setMarker(objectListing.getNextMarker());
    } while (objectListing.isTruncated());
ZZzzZZzz asked Jul 31 '15 19:07



2 Answers

For anyone reading this in 2018 or later: the Java SDK has a newer API that lets you iterate over the objects in an S3 bucket easily, without handling pagination yourself:

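import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import com.amazonaws.services.s3.iterable.S3Objects;
import com.amazonaws.services.s3.model.S3ObjectSummary;

AmazonS3 s3 = AmazonS3ClientBuilder.standard().build();

// S3Objects fetches further pages on demand while you iterate.
S3Objects.inBucket(s3, "bucket").forEach((S3ObjectSummary objectSummary) -> {
    // TODO: Consume `objectSummary` the way you need
    // System.out.println(objectSummary.getKey());
});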
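
To illustrate how an Iterable can hide pagination from the caller, here is a minimal, self-contained sketch. It is not the SDK's actual implementation: `PagedIterableSketch`, `fetchPage`, and `pagesFetched` are hypothetical names, and an in-memory list stands in for the bucket. The iterator fetches the next page only when the current one is exhausted.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class PagedIterableSketch implements Iterable<String> {
    final List<String> source; // hypothetical stand-in for the remote bucket
    final int pageSize;        // maximum number of keys per simulated request
    int pagesFetched = 0;      // counts simulated requests, for illustration only

    PagedIterableSketch(List<String> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // Simulates one listing request: at most pageSize keys starting at offset.
    private List<String> fetchPage(int offset) {
        pagesFetched++;
        int end = Math.min(offset + pageSize, source.size());
        return source.subList(offset, end);
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private List<String> page = new ArrayList<>();
            private int indexInPage = 0;
            private int nextOffset = 0;

            @Override
            public boolean hasNext() {
                // Fetch another page only when the current one is used up
                // and the source still has keys left.
                while (indexInPage >= page.size() && nextOffset < source.size()) {
                    page = fetchPage(nextOffset);
                    nextOffset += page.size();
                    indexInPage = 0;
                }
                return indexInPage < page.size();
            }

            @Override
            public String next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return page.get(indexInPage++);
            }
        };
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add(String.format("key-%05d", i));
        }
        PagedIterableSketch objects = new PagedIterableSketch(keys, 1000);
        int count = 0;
        for (String key : objects) {
            count++;
        }
        System.out.println(count + " keys in " + objects.pagesFetched + " pages");
    }
}
```

The caller writes a plain for-each loop and never sees a marker or a truncation flag, which is the convenience the S3Objects call above provides.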
madhead - StandWithUkraine answered Oct 31 '22 06:10


This solved my problem. I set the marker from each truncated listing before requesting the next page, and was able to print all the objects (more than 1000).

 // Ask for the output path once, before paging through the bucket.
 System.out.println("Enter the path where to save your file");
 Scanner scan = new Scanner(System.in);
 String path = scan.nextLine();
 File fileOne = new File(path);

 ListObjectsRequest listObjectsRequest = new ListObjectsRequest()
     .withBucketName(bucketName);
 ObjectListing objectListing;
 // try-with-resources flushes and closes the writer once all pages are processed.
 try (BufferedWriter bw = new BufferedWriter(new FileWriter(fileOne.getAbsoluteFile(), true))) {
     bw.write("Writing data to file");
     bw.write("\n");
     do {
         objectListing = s3.listObjects(listObjectsRequest);
         for (S3ObjectSummary objectSummary : objectListing.getObjectSummaries()) {
             String key = objectSummary.getKey();
             String dummyKey = key.substring(2);
             if (dummyKey.equalsIgnoreCase("somestring")) {
                 // Close the object and its stream after reading it.
                 try (S3Object s3object = s3.getObject(new GetObjectRequest(bucketName, key));
                      BufferedReader reader = new BufferedReader(new InputStreamReader(s3object.getObjectContent()))) {
                     String line;
                     int i = 0;
                     while ((line = reader.readLine()) != null) {
                         if (i > 0) { // skip the first line of each object
                             bw.append(line + "," + s3object.getKey().substring(0, 2));
                             bw.append(objectSummary.getLastModified().toString());
                             bw.newLine();
                         }
                         i++;
                         System.out.println(line);
                     }
                 }
             }
         }
         // Resume the next request where this page ended.
         listObjectsRequest.setMarker(objectListing.getNextMarker());
     } while (objectListing.isTruncated());
 }
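
The marker loop itself can be tried without an AWS account. The following is a minimal sketch under stated assumptions: `Page`, `listPage`, and `listAll` are hypothetical stand-ins for the SDK types, a sorted in-memory list plays the role of the bucket, and the marker is simply the last key of the previous page. It shows why one `setMarker` call per page, followed by the `isTruncated()` check, is enough to walk past the 1000-object page limit.

```java
import java.util.ArrayList;
import java.util.List;

public class MarkerPagingSketch {
    /** One page of results: the keys plus the marker for the next request (null when done). */
    static final class Page {
        final List<String> keys;
        final String nextMarker;

        Page(List<String> keys, String nextMarker) {
            this.keys = keys;
            this.nextMarker = nextMarker;
        }

        boolean isTruncated() {
            return nextMarker != null;
        }
    }

    /** Hypothetical listing call: returns at most pageSize keys that come after marker. */
    static Page listPage(List<String> sortedKeys, String marker, int pageSize) {
        int start = 0;
        if (marker != null) {
            // Resume just past the marker key.
            start = sortedKeys.indexOf(marker) + 1;
        }
        int end = Math.min(start + pageSize, sortedKeys.size());
        List<String> keys = new ArrayList<>(sortedKeys.subList(start, end));
        // More keys remain only if this page stopped before the end of the list.
        String next = end < sortedKeys.size() ? sortedKeys.get(end - 1) : null;
        return new Page(keys, next);
    }

    /** Same shape as the do/while loop above: fetch, consume, advance the marker, repeat. */
    static List<String> listAll(List<String> sortedKeys, int pageSize) {
        List<String> all = new ArrayList<>();
        String marker = null;
        Page page;
        do {
            page = listPage(sortedKeys, marker, pageSize);
            all.addAll(page.keys);
            marker = page.nextMarker; // advance once per page, after consuming it
        } while (page.isTruncated());
        return all;
    }

    public static void main(String[] args) {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            keys.add(String.format("key-%05d", i));
        }
        List<String> all = listAll(keys, 1000);
        System.out.println("Listed " + all.size() + " keys");
    }
}
```

With 2500 keys and a page size of 1000, the loop makes three requests and collects every key exactly once.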
ZZzzZZzz answered Oct 31 '22 05:10