I've recently started working with S3 and need to upload and compress large files (~10 GB) to it. The current implementation I'm working with creates a temporary compressed file locally, uploads it to S3, and finally deletes the temp file. The problem is that for a 10 GB file I end up with almost 20 GB stored locally until the upload finishes.

I need a way to transfer the file to S3 and then compress it there. Is this approach viable? If yes, how should I approach it? If not, is there any way I can minimize the local space needed?

I've seen someone suggest that the file could be uploaded to S3, downloaded to an EC2 instance in the same region, compressed there, and then uploaded back to S3 while deleting the first copy on S3. This might work, but two uploads to get one file up doesn't seem like an advantage cost-wise.
I've tried to upload a compression stream without success, but I've just discovered that S3 does not support compression streaming, and now I'm clueless as to how to proceed.
I'm using the GZip library in .NET.
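Roughly, what I have now looks like this (a simplified sketch; the bucket, key, and file names are placeholders, and the TransferUtility call from the AWSSDK.S3 package just stands in for however the upload is done):

using System.IO;
using System.IO.Compression;
using Amazon.S3;
using Amazon.S3.Transfer;

// Compress to a temp file, upload it, then delete it.
// The temp file is what costs the extra ~10 GB of local disk.
var tempPath = Path.GetTempFileName();
using (var source = File.OpenRead("big_file"))
using (var target = File.Create(tempPath))
using (var gzip = new GZipStream(target, CompressionMode.Compress))
{
    source.CopyTo(gzip);
}

using (var s3 = new AmazonS3Client())
{
    var transfer = new TransferUtility(s3);
    transfer.Upload(tempPath, "my-bucket", "folder/big_file.gz");
}

File.Delete(tempPath);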
S3 is just storage. Whatever file you upload is the file that is stored. You cannot upload a zip file and then extract it once it's in S3.
You can decompress data as you read it from Amazon S3, or compress data as you write it to Amazon S3.
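To make the write side concrete: the low-level multipart API in the AWS SDK for .NET lets you ship compressed parts as they're produced, so only one ~5 MB buffer sits locally at a time. This is only a sketch under a few assumptions: the AWSSDK.S3 package, a modern .NET runtime (where GZipStream.Flush pushes pending compressed output to the underlying stream), and placeholder bucket, key, and file names.

using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using Amazon.S3;
using Amazon.S3.Model;

const long PartSize = 5 * 1024 * 1024; // S3 minimum part size (except the last part)

using var s3 = new AmazonS3Client();
var init = await s3.InitiateMultipartUploadAsync(new InitiateMultipartUploadRequest
{
    BucketName = "my-bucket",
    Key = "folder/big_file.gz"
});

var etags = new List<PartETag>();
int partNumber = 1;
var buffer = new MemoryStream(); // holds at most one part's worth of compressed bytes

async Task FlushPartAsync()
{
    buffer.Position = 0;
    var part = await s3.UploadPartAsync(new UploadPartRequest
    {
        BucketName = "my-bucket",
        Key = "folder/big_file.gz",
        UploadId = init.UploadId,
        PartNumber = partNumber,
        InputStream = buffer,
        PartSize = buffer.Length
    });
    etags.Add(new PartETag(partNumber, part.ETag));
    partNumber++;
    buffer.SetLength(0); // reuse the buffer for the next part
}

using (var source = File.OpenRead("big_file"))
using (var gzip = new GZipStream(buffer, CompressionMode.Compress, leaveOpen: true))
{
    var chunk = new byte[81920];
    int read;
    while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
    {
        gzip.Write(chunk, 0, read);
        if (buffer.Length >= PartSize)
        {
            gzip.Flush();           // push pending compressed bytes into the buffer
            await FlushPartAsync(); // ship them as one part
        }
    }
} // disposing the GZipStream writes the gzip footer into the buffer

if (buffer.Length > 0)
    await FlushPartAsync(); // the last part may be smaller than 5 MB

await s3.CompleteMultipartUploadAsync(new CompleteMultipartUploadRequest
{
    BucketName = "my-bucket",
    Key = "folder/big_file.gz",
    UploadId = init.UploadId,
    PartETags = etags
});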
To upload folders and files to an S3 bucket: sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket that you want to upload your folders or files to, then choose Upload.
S3 does not support stream compression, nor is it possible to compress the uploaded file remotely. If this is a one-time process, I suggest downloading it to an EC2 machine in the same region, compressing it there, and then uploading it to your destination.
In the Linux shell, via the AWS CLI, this was added about 3 months after you asked the question :-)
Added the ability to stream data using cp
So the best you can do, I guess, is to pipe the output of gzip to aws cli:
Upload from stdin:
gzip -c big_file | aws s3 cp - s3://bucket/folder/big_file.gz
Download to stdout:
aws s3 cp s3://bucket/folder/big_file.gz - | gunzip -c ...
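And since you're in .NET anyway, you could drive the same pipe from code instead of the shell. A sketch, assuming the aws CLI is installed and on PATH, with placeholder bucket, key, and file names:

using System.Diagnostics;
using System.IO;
using System.IO.Compression;

// Stream gzip output straight into `aws s3 cp -` so the compressed
// data never touches the local disk.
var psi = new ProcessStartInfo("aws", "s3 cp - s3://bucket/folder/big_file.gz")
{
    RedirectStandardInput = true,
    UseShellExecute = false
};

using var aws = Process.Start(psi);
using (var source = File.OpenRead("big_file"))
using (var gzip = new GZipStream(aws.StandardInput.BaseStream, CompressionMode.Compress, leaveOpen: true))
{
    source.CopyTo(gzip);
} // disposing the GZipStream flushes the gzip footer

aws.StandardInput.Close(); // signal EOF so the CLI finishes the upload
aws.WaitForExit();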