How to estimate zip file size in Java before creating it

Tags:

zip

I have a requirement to create a zip file from a list of available files. The files are of different types, such as txt, pdf and xml. I am using the java.util classes to do it.

The zip file must not exceed 5 MB. I should select files from the list based on timestamp and add them to the zip until its size reaches 5 MB, then skip the remaining files.

Is there a way in Java to estimate the zip file size in advance, without creating the actual file?

Or is there another approach to handle this?


asked Aug 26 '10 06:08

Vignesh

2 Answers

Wrap your ZipOutputStream in a custom class, called YourOutputStream here.

The constructor of YourOutputStream takes the real ZipOutputStream (zos) and creates a second ZipOutputStream (zos2) that wraps a new ByteArrayOutputStream (baos):
public YourOutputStream(ZipOutputStream zos, int maxSizeInBytes)
When you write a file through YourOutputStream, it first writes the file to zos2:
public void writeFile(File file) throws ZipFileFullException
public void writeFile(String path) throws ZipFileFullException
etc...
if baos.size() is under maxSizeInBytes
- write the file to zos
else
- close zos, baos and zos2, and throw an exception. I can't think of an existing exception that fits; if there is one, use it, otherwise create your own ZipFileFullException that extends IOException.

You need two ZipOutputStreams: one that is written to your drive, and one used to check whether the contents exceed 5 MB.

EDIT: I checked, and you can't easily remove a ZipEntry once it has been written, which is why the file goes to the in-memory stream first.

http://download.oracle.com/javase/6/docs/api/java/io/ByteArrayOutputStream.html#size()
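
A minimal sketch of the two-stream idea described above, under some assumptions: the class names SizeLimitedZipWriter and ZipFileFullException are hypothetical, "probe" plays the role of zos2, file names are assumed to be unique, and each file is assumed small enough to read into memory. The 46-byte and 22-byte constants are an estimate of the central directory that ZipOutputStream only writes on close, so treat the check as approximate rather than exact.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical exception: thrown when the next file would not fit. */
class ZipFileFullException extends IOException {
    ZipFileFullException(String message) {
        super(message);
    }
}

/** Hypothetical wrapper: an in-memory probe archive plus the real archive. */
class SizeLimitedZipWriter implements AutoCloseable {
    private final ByteArrayOutputStream baos = new ByteArrayOutputStream();
    private final ZipOutputStream probe = new ZipOutputStream(baos); // "zos2"
    private final ZipOutputStream real;                              // "zos"
    private final long maxSizeInBytes;
    // Estimated size of the central directory, which is written only on close.
    // Starts at 22 bytes for the end-of-central-directory record.
    private long directoryBytes = 22;

    SizeLimitedZipWriter(OutputStream target, long maxSizeInBytes) {
        this.real = new ZipOutputStream(target);
        this.maxSizeInBytes = maxSizeInBytes;
    }

    /** Writes the file to the real archive only if the probe stays within the limit. */
    void writeFile(Path file) throws IOException {
        String name = file.getFileName().toString();
        byte[] data = Files.readAllBytes(file); // assumes the file fits in memory

        // Trial run: compress into memory first, because an entry cannot be undone.
        writeEntry(probe, name, data);
        directoryBytes += 46 + name.getBytes(StandardCharsets.UTF_8).length;

        if (baos.size() + directoryBytes > maxSizeInBytes) {
            // After this point the writer should only be closed, not reused.
            throw new ZipFileFullException(
                    name + " would push the archive past " + maxSizeInBytes + " bytes");
        }
        writeEntry(real, name, data);
    }

    private static void writeEntry(ZipOutputStream zos, String name, byte[] data)
            throws IOException {
        zos.putNextEntry(new ZipEntry(name));
        zos.write(data);
        zos.closeEntry();
    }

    @Override
    public void close() throws IOException {
        try {
            probe.close();
        } finally {
            real.close(); // finishes the real archive with the entries accepted so far
        }
    }
}
```

Used in a try-with-resources block, the caller would loop over the files in timestamp order, call writeFile for each, and stop at the first ZipFileFullException; the real archive is still closed cleanly with the files that fit. The cost is that every file is compressed twice and the probe holds up to the size limit in memory.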

answered Oct 13 '22 19:10

Colin Hebert

+1 for Colin Hebert: add files one by one, and if the archive gets too big, either go back to the previous step or remove the last file. I just want to add some details:

Prediction is way too unreliable. E.g. a PDF can contain uncompressed text and compress down to 30% of the original, or it can contain already-compressed text and images and only compress to 80%. You would need to inspect the entire PDF for compressibility, which basically means compressing it.
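
To illustrate how much the ratio depends on content, here is a small sketch that measures it by actually deflating the bytes. The class and method names are hypothetical, and it uses java.util.zip.Deflater with default settings, so the number is only indicative of what a zip entry would take.

```java
import java.util.zip.Deflater;

/** Hypothetical helper: measures compressibility by compressing. */
class CompressibilityProbe {
    /** Returns compressed size divided by original size (1.0 for empty input). */
    static double deflateRatio(byte[] data) {
        if (data.length == 0) {
            return 1.0;
        }
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(data);
            deflater.finish();
            byte[] buffer = new byte[8192];
            long compressed = 0;
            // Only the byte count matters here, so the output is discarded.
            while (!deflater.finished()) {
                compressed += deflater.deflate(buffer);
            }
            return (double) compressed / data.length;
        } finally {
            deflater.end(); // releases the native zlib resources
        }
    }
}
```

Highly repetitive text gives a ratio close to 0, while random or already-compressed bytes give a ratio close to 1, and there is no way to know which case applies without reading the whole input.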

You could try a statistical prediction. That could reduce the number of failed attempts, but you would still have to implement the recommendation above. Go with the simpler implementation first, and see if it's enough.

Alternatively, compress the files individually, then pick the files that won't exceed 5 MB when bound together. If unpacking is automated too, you could bind the individual zip files into a single uncompressed zip file.
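
A rough sketch of the "compress individually, then pick" idea, with assumptions stated up front: ZipBudgetPicker and its methods are hypothetical names, the list is assumed to be sorted by timestamp already, and selection stops at the first file that does not fit, as the question asks. Each file is zipped on its own in memory and the sizes are summed; since every single-file archive carries its own end record, the sum should slightly overestimate one combined archive, which errs on the safe side.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper: selects files whose individual zips fit a size budget. */
class ZipBudgetPicker {
    /** Size in bytes of an in-memory zip archive holding only this file. */
    static long singleFileZipSize(Path file) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ZipOutputStream zos = new ZipOutputStream(baos)) {
            zos.putNextEntry(new ZipEntry(file.getFileName().toString()));
            Files.copy(file, zos);
            zos.closeEntry();
        }
        // Read the size after close so the central directory is included.
        return baos.size();
    }

    /** Returns the leading files whose summed single-file zip sizes stay within the limit. */
    static List<Path> pickWithinLimit(List<Path> filesInPriorityOrder, long maxSizeInBytes)
            throws IOException {
        List<Path> picked = new ArrayList<>();
        long total = 0;
        for (Path file : filesInPriorityOrder) {
            long size = singleFileZipSize(file);
            if (total + size > maxSizeInBytes) {
                break; // skip this file and all remaining ones
            }
            total += size;
            picked.add(file);
        }
        return picked;
    }
}
```

The picked files can then be written to the real archive in one pass. Compared with the two-stream wrapper, this keeps only one compressed file in memory at a time, at the price of compressing each picked file twice.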

answered Oct 13 '22 21:10

peterchen
