I have this code, which writes a zip file to disk, reads it back, uploads it to S3, and then deletes the file:
compressed_file = some_temp_path
Zip::ZipOutputStream.open(compressed_file) do |zos|
  some_file_list.each do |file|
    zos.put_next_entry(file.some_title)
    zos.print IO.read(file.path)
  end
end # Write zip file
s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)
bucket.put("#{BUCKET_PATH}/archive.zip", IO.read(compressed_file), {}, 'authenticated-read')
File.delete(compressed_file)
This code already works, but I'd like to skip creating the zip file on disk. Is there a way to export the zip data directly to S3 without first creating a tempfile, reading it back, and then deleting it?
S3 is just storage: whatever file you upload is exactly the file that is stored. You cannot upload a zip file and have S3 extract it once it's there.
So if your ZIP data is stored on S3, extracting it typically involves downloading the ZIP file(s) to your local machine, unzipping them with a third-party tool like WinZip, and then re-uploading the extracted files back to S3 for further processing.
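If you'd rather script that round trip than do it by hand, a minimal sketch with the newer aws-sdk-s3 gem and rubyzip might look like the following. The region, bucket name, key, and the unzipped/ prefix are all assumptions, not from the original post:
require 'aws-sdk-s3'   # newer SDK; the question itself uses the older aws gem
require 'zip'          # rubyzip
require 'fileutils'

s3     = Aws::S3::Resource.new(region: 'us-east-1')  # assumed region
bucket = s3.bucket('my-bucket')                      # assumed bucket name

# Download the archive, extract it locally, re-upload each file
bucket.object('archive.zip').download_file('/tmp/archive.zip')

Zip::File.open('/tmp/archive.zip') do |zip|
  zip.each do |entry|
    next unless entry.file?
    dest = File.join('/tmp', entry.name)
    FileUtils.mkdir_p(File.dirname(dest))  # entries can be nested in folders
    entry.extract(dest)
    bucket.object("unzipped/#{entry.name}").upload_file(dest)
  end
end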
Sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket you want to upload your folders or files to, then choose Upload.
If you head to the Properties tab of your S3 bucket, you can set up an Event Notification for all object “create” events (or just PutObject events). As the destination, you can select the Lambda function where you will write your code to unzip and gzip files.
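A sketch of what such a Lambda handler could look like in Ruby, assuming the Ruby Lambda runtime with the aws-sdk-s3 and rubyzip gems available; the unzipped/ output prefix is a hypothetical choice:
# lambda_function.rb -- a sketch only, not a hardened implementation
require 'aws-sdk-s3'
require 'zip'
require 'stringio'

S3 = Aws::S3::Client.new

def handler(event:, context:)
  event['Records'].each do |record|
    bucket = record['s3']['bucket']['name']
    key    = record['s3']['object']['key']
    next unless key.end_with?('.zip')

    # Read the uploaded archive into memory (fine for modest archive sizes)
    zip_data = S3.get_object(bucket: bucket, key: key).body.read

    # Write each entry back to S3 as its own object
    Zip::File.open_buffer(StringIO.new(zip_data)) do |zip|
      zip.each do |entry|
        next unless entry.file?
        S3.put_object(bucket: bucket,
                      key: "unzipped/#{entry.name}",
                      body: entry.get_input_stream.read)
      end
    end
  end
end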
I think I just found the answer to my question.
It's Zip::ZipOutputStream.write_buffer. I'll check this out and update this answer when I get it working.
Update
It does work. My code now looks like this:
compressed_filestream = Zip::ZipOutputStream.write_buffer do |zos|
  some_file_list.each do |file|
    zos.put_next_entry(file.some_title)
    zos.print IO.read(file.path)
  end
end # Outputs zipfile as StringIO
s3 = Aws::S3.new(S3_KEY, S3_SECRET)
bucket = Aws::S3::Bucket.create(s3, S3_BUCKET)
compressed_filestream.rewind
bucket.put("#{BUCKET_PATH}/archive.zip", compressed_filestream.read, {}, 'authenticated-read')
write_buffer returns a StringIO, and you need to rewind the stream before reading it. Now I don't need to create and delete the tempfile.
I'm just wondering now whether write_buffer is more memory-intensive or heavier than open, or if it's the other way around?
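For what it's worth, write_buffer accumulates the entire archive in a StringIO, so its memory use grows roughly with the size of the finished zip, whereas open streams entries to a file on disk as they are written. A toy example (not from the original post) that shows the in-memory footprint:
require 'zip'  # rubyzip; 0.9.x versions used require 'zip/zip' instead

# Build a tiny archive entirely in memory
# (Zip::ZipOutputStream is the rubyzip 0.9.x name; newer versions call it Zip::OutputStream)
compressed_filestream = Zip::ZipOutputStream.write_buffer do |zos|
  zos.put_next_entry('hello.txt')
  zos.print 'hello'
end

# The StringIO holds the whole archive, so its size is the RAM cost
puts compressed_filestream.size  # => archive size in bytes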