
How to read a zipped CSV file using Java inside an AWS S3 bucket?

I needed to read a .csv file from an S3 bucket, which I did with:

S3Object s3Obj = amazonS3Client.getObject(bucketname, fileName);
BufferedReader reader = new BufferedReader(new InputStreamReader(s3Obj.getObjectContent())); 

Now the same .csv file is stored in the S3 bucket as a zipped archive. I need to read it without performing an unzip operation on my server.

Is there any AWS documentation or API for reading the .csv file directly, without unzipping it first?

mohd ilyas asked Sep 12 '25 20:09


2 Answers

You can read a zipped CSV file directly from Amazon S3 with these steps:

  1. Get the object from S3
  2. Create a ZipInputStream with the object's data
  3. Create a Reader with the ZipInputStream

Example:

AmazonS3 s3Client = AmazonS3ClientBuilder.defaultClient();  
S3Object object = s3Client.getObject("mybucket","myfile.csv.zip");  
ZipInputStream in = new ZipInputStream(object.getObjectContent());  
BufferedReader reader = new BufferedReader(new InputStreamReader(in));  

Because a zip file can contain many files, you need to position the ZipInputStream at the beginning of each ZipEntry to read each contained file individually. (Even if the zip file contains only one file, you must call getNextEntry() once to position the stream at the start of that lone ZipEntry.)

String line;
while (in.getNextEntry() != null) { // loop through each file within the zip
    while ((line = reader.readLine()) != null) { // loop through each line
        System.out.println(line);
    }
}
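
The steps above can also be tried without an S3 bucket. The following is a minimal sketch, not a definitive implementation: ZippedCsvReader and readAllLines are hypothetical names (they are not part of the AWS SDK), UTF-8 is an assumed encoding, and the in-memory zip in main only stands in for object.getObjectContent(). It creates a fresh reader for each entry and closes the ZipInputStream with try-with-resources.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class ZippedCsvReader {

    /** Reads every line of every file entry in the zip stream, decoding as UTF-8 (an assumption). */
    static List<String> readAllLines(InputStream zipped) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no data
                }
                // One reader per entry. It is deliberately not closed here,
                // because closing it would also close the ZipInputStream.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(zip, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // Build a small zip in memory to stand in for object.getObjectContent().
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream out = new ZipOutputStream(buffer)) {
            out.putNextEntry(new ZipEntry("data.csv"));
            out.write("id,name\n1,alice\n".getBytes(StandardCharsets.UTF_8));
            out.closeEntry();
        }
        for (String line : readAllLines(new ByteArrayInputStream(buffer.toByteArray()))) {
            System.out.println(line);
        }
    }
}
```

With a real bucket, the call would presumably look like ZippedCsvReader.readAllLines(object.getObjectContent()). Collecting all lines into a list is only suitable for small files; for large ones, process each line inside the loop instead.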

Craig Wohlfeil answered Sep 15 '25 11:09


If s3Obj.getObjectContent() in your example returns a ZIP-compressed stream, then something like the following should let you access its entries:

ZipInputStream in = new ZipInputStream(s3Obj.getObjectContent());
ZipEntry entry; // requires java.util.zip.ZipEntry
while ((entry = in.getNextEntry()) != null) { // iterate over the entries in the archive
    System.out.printf("entry: %s%n", entry.getName());
}
in.close();
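
Once the entries can be listed, the entry name can be used to pick out the CSV file. The following is a rough sketch under stated assumptions: CsvEntryFinder and readFirstCsv are hypothetical names, the CSV entry is assumed to be identifiable by a ".csv" suffix, and UTF-8 is an assumed encoding.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper for illustration; not part of the AWS SDK.
final class CsvEntryFinder {

    /**
     * Returns the lines of the first entry whose name ends with ".csv"
     * (case-insensitive), or an empty list if the archive has no such entry.
     */
    static List<String> readFirstCsv(InputStream zipped) throws IOException {
        List<String> lines = new ArrayList<>();
        try (ZipInputStream in = new ZipInputStream(zipped)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                String name = entry.getName().toLowerCase(Locale.ROOT);
                if (entry.isDirectory() || !name.endsWith(".csv")) {
                    continue; // skip directories and non-CSV files
                }
                // The stream is now positioned at the CSV entry; read it to its end.
                BufferedReader reader = new BufferedReader(
                        new InputStreamReader(in, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    lines.add(line);
                }
                break; // only the first matching entry is wanted
            }
        }
        return lines;
    }
}
```

In the question's setting this would be called as CsvEntryFinder.readFirstCsv(s3Obj.getObjectContent()). Nothing is written to disk; the archive is decompressed as it is streamed.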

SubOptimal answered Sep 15 '25 09:09