
How to estimate zip file size in java before creating it

Tags:

java

zip

I have a requirement to create a zip file from a list of available files. The files are of different types, such as txt, pdf and xml. I am using the java.util classes to do it.

The requirement is to keep the zip file within a maximum size of 5 MB. I should select files from the list based on timestamp and add them to the zip until its size reaches 5 MB, then skip the remaining files.

Is there a way in Java to estimate the zip file size in advance, without creating the actual file?

Or is there another approach to handle this?

asked Aug 26 '10 06:08

Vignesh



2 Answers

Wrap your ZipOutputStream in a custom class, called YourOutputStream here.

  • The constructor of YourOutputStream keeps the ZipOutputStream it is given (zos, referred to as zos1 below) and creates a second ZipOutputStream (zos2) that wraps a new ByteArrayOutputStream (baos):
    public YourOutputStream(ZipOutputStream zos, int maxSizeInBytes)
  • When you write a file through YourOutputStream, it first writes the file to zos2:
    public void writeFile(File file) throws ZipFileFullException
    public void writeFile(String path) throws ZipFileFullException
    etc...
  • If baos.size() is still under maxSizeInBytes:
    • write the file to zos1.
  • Otherwise:
    • close zos1, baos and zos2, and throw an exception. I can't think of an existing exception that fits; if there is one, use it, otherwise create your own IOException subclass, ZipFileFullException.

You need two ZipOutputStreams: one that is written to your drive, and one to check whether your contents exceed 5 MB.

EDIT: I checked, and in fact you can't easily remove a ZipEntry once it has been written.

http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayOutputStream.html#size()
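
The two-stream idea above can be sketched roughly as follows. This is only a minimal illustration, not a tested production class: SizeLimitedZip and tryAdd are hypothetical names, the method takes a byte array instead of a File to keep the example short, and it returns false instead of throwing ZipFileFullException. One extra assumption: baos.size() does not include the central directory that a ZipOutputStream writes when it is closed, so the sketch adds an approximate 46 bytes plus the name length per entry, and 22 bytes for the end record.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: every entry is first written to an in-memory trial zip
// (zos2 over baos); it is copied to the real zip (zos1) only if it still fits.
final class SizeLimitedZip implements Closeable {
    private static final int END_RECORD_BYTES = 22;     // end-of-central-directory record
    private static final int CENTRAL_HEADER_BYTES = 46; // fixed part of one central directory header

    private final ZipOutputStream target;                                          // zos1
    private final ByteArrayOutputStream trialBytes = new ByteArrayOutputStream();  // baos
    private final ZipOutputStream trial = new ZipOutputStream(trialBytes);         // zos2
    private final long maxSizeInBytes;
    private long directoryBytes = END_RECORD_BYTES; // approximate, see the note above
    private boolean full;

    SizeLimitedZip(ZipOutputStream zos, long maxSizeInBytes) {
        this.target = zos;
        this.maxSizeInBytes = maxSizeInBytes;
    }

    /** Returns true if the entry was added, false once the size limit has been hit. */
    boolean tryAdd(String name, byte[] data) throws IOException {
        if (full) {
            return false; // the trial zip already holds a rejected entry, so stop here
        }
        writeEntry(trial, name, data);
        long entryDirectoryBytes = CENTRAL_HEADER_BYTES + name.getBytes(StandardCharsets.UTF_8).length;
        if (trialBytes.size() + directoryBytes + entryDirectoryBytes > maxSizeInBytes) {
            full = true;
            return false;
        }
        writeEntry(target, name, data);
        directoryBytes += entryDirectoryBytes;
        return true;
    }

    private static void writeEntry(ZipOutputStream zos, String name, byte[] data) throws IOException {
        zos.putNextEntry(new ZipEntry(name));
        zos.write(data);
        zos.closeEntry(); // flushes the compressed bytes, so the size check sees them
    }

    @Override
    public void close() throws IOException {
        try {
            trial.close();
        } finally {
            target.close();
        }
    }
}
```

Note that the trial copy lives entirely in memory, so this costs roughly the size limit plus one compressed file in heap, and every accepted file is compressed twice.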

answered Oct 13 '22 19:10

Colin Hebert


+1 for Colin Hebert: add files one by one, and either back up the previous step or remove the last file if the archive is too big. I just want to add some details:

Prediction is far too unreliable. For example, a PDF can contain uncompressed text and compress down to 30% of the original size, or it can contain already-compressed text and images and compress only to 80%. You would need to inspect the entire PDF for compressibility, which basically means compressing it.
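
To illustrate "basically having to compress them": a minimal sketch that runs a file's bytes through deflate and only counts the output, so nothing is written to disk or kept in memory. DeflatedSize and CountingSink are hypothetical names. The result is the compressed payload only; it assumes the default compression level and leaves out the per-entry zip headers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;

// Hypothetical helper: measures how many bytes a stream deflates to.
final class DeflatedSize {
    /** OutputStream that discards everything and only counts the bytes. */
    private static final class CountingSink extends OutputStream {
        long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }
    }

    static long measure(InputStream in) throws IOException {
        CountingSink sink = new CountingSink();
        // nowrap = true gives raw deflate data, the format used inside zip entries
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } finally {
            deflater.end(); // a caller-supplied Deflater is not released by close()
        }
        return sink.count;
    }
}
```

This still reads and compresses the whole file, which is exactly the cost the paragraph above describes.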

You could try a statistical prediction. That could reduce the number of failed attempts, but you would still have to implement the recommendation above. Go with the simpler implementation first, and see if it's enough.

Alternatively, compress the files individually, then pick the files that won't exceed 5 MB when bound together. If unpacking is automated too, you could bind the individual zip files into a single uncompressed zip file.
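
A rough sketch of that last alternative, under some assumptions: ZipOfZips, zipSingle, storedCost and bundle are hypothetical names, everything is held in memory for brevity, and the per-entry overhead (30 bytes plus name for the local header, 46 bytes plus name for the central header, 22 bytes for the end record) is an approximation that ignores extra fields and comments. Because the outer zip uses the STORED method, its size can be added up before anything is written.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: one small zip per file, bundled uncompressed into an outer zip.
final class ZipOfZips {
    /** Compresses one file into its own single-entry zip, held in memory. */
    static byte[] zipSingle(String name, byte[] data) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(data);
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Approximate cost of one STORED entry: local header + data + central header. */
    static long storedCost(String name, int length) {
        int nameBytes = name.getBytes(StandardCharsets.UTF_8).length;
        return 30L + nameBytes + length + 46L + nameBytes;
    }

    /** Bundles inner zips in iteration order until the next one would exceed maxBytes. */
    static byte[] bundle(Map<String, byte[]> innerZips, long maxBytes) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        long total = 22; // end-of-central-directory record
        try (ZipOutputStream zos = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> e : innerZips.entrySet()) {
                byte[] inner = e.getValue();
                long cost = storedCost(e.getKey(), inner.length);
                if (total + cost > maxBytes) {
                    break; // skip the remaining files
                }
                CRC32 crc = new CRC32();
                crc.update(inner);
                ZipEntry entry = new ZipEntry(e.getKey());
                entry.setMethod(ZipEntry.STORED); // no second compression pass
                entry.setSize(inner.length);      // STORED entries need size and CRC up front
                entry.setCompressedSize(inner.length);
                entry.setCrc(crc.getValue());
                zos.putNextEntry(entry);
                zos.write(inner);
                zos.closeEntry();
                total += cost;
            }
        }
        return bytes.toByteArray();
    }
}
```

Passing a LinkedHashMap whose entries are already sorted by timestamp keeps the selection order the question asks for. The receiving side then has to unpack two levels, which is why this only makes sense if unpacking is automated.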

answered Oct 13 '22 21:10

peterchen