
Uploading and compressing a file to S3


I've recently started working with S3 and have come across the need to upload and compress large files (roughly 10 GB) to it. The current implementation I'm working with creates a temporary compressed file locally, uploads it to S3, and finally deletes the temp file. The problem is that for a 10 GB file I have almost 20 GB stored locally until the upload is done. I need a way to transfer the file to S3 and then compress it there. Is this approach viable? If yes, how should I be addressing it? If not, is there any way I can minimize the local space needed? I've seen someone suggesting that the file could be uploaded to S3, downloaded to an EC2 instance in the same region, compressed there, and then uploaded back to S3 while deleting the first copy. This might work, but two uploads to get one file up doesn't seem like an advantage cost-wise.
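(For reference, the temp-file workflow described above is roughly the shell equivalent of the following; the file and bucket names are placeholders:)

gzip -c big_file > big_file.gz                        # compress to a temp file (~2x local disk in use)
aws s3 cp big_file.gz s3://bucket/folder/big_file.gz  # upload the compressed copy
rm big_file.gz                                        # free the local space after the upload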

I've tried to upload a compression stream without success, but I've just discovered that S3 does not support compression streaming, and now I am clueless as to how to proceed.

I'm using the gzip library in .NET.

Asked Jun 05 '14 by VmLino

People also ask

Can I upload a zip file to S3?

S3 is just storage: whatever file you upload is the file that is stored. You cannot upload a zip file and then extract it once it's in S3.

Does S3 compress files?

Not by itself; you compress the data yourself when you write it to Amazon S3 and decompress it when you read it back.

What are the steps of uploading a file in S3 bucket?

To upload folders and files to an S3 bucket: sign in to the AWS Management Console and open the Amazon S3 console at https://console.aws.amazon.com/s3/. In the Buckets list, choose the name of the bucket that you want to upload your folders or files to, then choose Upload.
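For scripted uploads, the same thing can be done from the command line with the AWS CLI (bucket and file names here are placeholders):

aws s3 cp my_file s3://my-bucket/my_file                  # upload a single file
aws s3 cp my_folder s3://my-bucket/my_folder --recursive  # upload a folder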

Does S3 automatically compress data?

S3 does not support stream compression, nor is it possible to compress an uploaded file remotely. If this is a one-time process, I suggest downloading it to an EC2 machine in the same region, compressing it there, and then uploading it to your destination.
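On the EC2 instance, the re-compression step can itself be streamed so the file never touches local disk there; a minimal sketch, assuming the aws CLI is configured and using placeholder bucket paths:

aws s3 cp s3://bucket/folder/big_file - | gzip -c | aws s3 cp - s3://bucket/folder/big_file.gz  # download, compress, re-upload as one stream
aws s3 rm s3://bucket/folder/big_file                                                           # then remove the uncompressed original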


1 Answer

In the Linux shell, via the aws CLI, this was added about 3 months after you asked the question :-)

Added the ability to stream data using cp

So the best you can do, I guess, is to pipe the output of gzip to the aws CLI:

Upload from stdin:

gzip -c big_file | aws s3 cp - s3://bucket/folder/big_file.gz

Download to stdout:

aws s3 cp s3://bucket/folder/big_file.gz - | gunzip -c ...
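One caveat: when uploading from stdin the CLI cannot know the final object size in advance, so for very large streams (beyond roughly 50 GB) you can pass --expected-size so it picks suitable multipart part sizes; for a 10 GB file the defaults should be fine. For example (the byte count here is just an estimate of the compressed size):

gzip -c big_file | aws s3 cp - s3://bucket/folder/big_file.gz --expected-size 5000000000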

Answered Sep 22 '22 by Ferdinand.kraft