
Amazon S3 concatenate small files

Is there a way to concatenate small files (each smaller than 5 MB) on Amazon S3? Multipart Upload is not an option because the files are too small.

Pulling all of these files down and concatenating them locally is not an efficient solution.

So, can anybody point me to an API that can do this?

jiafeng fu asked Sep 08 '15 02:09



1 Answer

Amazon S3 does not provide a concatenate function. It is primarily an object storage service.

You will need some process that downloads the objects, combines them, then uploads them again. The most efficient way to do this would be to download the objects in parallel, to take full advantage of available bandwidth. However, that is more complex to code.
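The download, combine, upload process described above could be sketched in Python roughly as follows. This is only a minimal illustration, assuming the boto3 library and objects small enough to hold in memory together; the function names and the bucket/key parameters are hypothetical, not part of any S3 API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable


def concatenate_parallel(fetch: Callable[[str], bytes],
                         keys: Iterable[str],
                         max_workers: int = 8) -> bytes:
    """Fetch every key concurrently and join the bodies in the order given."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, even when the
        # individual downloads finish out of order.
        bodies = list(pool.map(fetch, keys))
    return b"".join(bodies)


def concatenate_s3_objects(bucket: str, source_keys: Iterable[str],
                           target_key: str, max_workers: int = 8) -> None:
    """Hypothetical wrapper: download from S3 in parallel, upload one object."""
    import boto3  # imported lazily so the helper above works without boto3

    s3 = boto3.client("s3")

    def fetch(key: str) -> bytes:
        # Read the whole object body into memory (assumes small objects).
        return s3.get_object(Bucket=bucket, Key=key)["Body"].read()

    combined = concatenate_parallel(fetch, source_keys, max_workers)
    s3.put_object(Bucket=bucket, Key=target_key, Body=combined)
```

The thread pool is what adds the extra complexity: the downloads overlap, but the result is still assembled in the original key order.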

I would recommend doing the processing "in the cloud" to avoid having to download the objects across the Internet. Doing it on Amazon EC2 or AWS Lambda, in the same region as the bucket, would be more efficient and less costly.
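If the work is done in AWS Lambda, a handler might look something like the sketch below. It assumes boto3 (which the Lambda Python runtime provides) and a made-up event shape with `bucket`, `source_keys`, and `target_key` fields; the optional `s3` argument is also an illustrative addition that makes the function easy to exercise without AWS. It downloads sequentially and keeps everything in memory, so it only suits a modest number of small objects.

```python
import io


def lambda_handler(event, context, s3=None):
    """Concatenate the objects named in the (hypothetical) event into one."""
    if s3 is None:
        import boto3  # available by default in the Lambda Python runtime
        s3 = boto3.client("s3")

    bucket = event["bucket"]
    buffer = io.BytesIO()
    for key in event["source_keys"]:
        # Append each object's bytes in the order the keys were listed.
        buffer.write(s3.get_object(Bucket=bucket, Key=key)["Body"].read())

    s3.put_object(Bucket=bucket, Key=event["target_key"],
                  Body=buffer.getvalue())
    return {"target_key": event["target_key"], "bytes_written": buffer.tell()}
```

Because the function runs inside AWS, the object data never crosses the public Internet on its way to or from the bucket.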

John Rotenstein answered Oct 21 '22 21:10