I have a fairly large amount of streamable data (>100 MB) which, for the sake of compression, I would like to host packed in a zip file on an HTTP server. The zip file contains a single file.
Is it possible for a Java client to stream the data via HTTP even though it is packed in a zip file?
According to Wikipedia, ZIP files are not sequential...
http://en.wikipedia.org/wiki/ZIP_(file_format)#Structure
If this is still possible somehow, then how?
Edit: about gzip: as I said, I use a custom Java client (not a web browser). Is gzip available in the Java HTTP implementation?
To read the file represented by a ZipEntry, you can obtain an InputStream from the ZipFile like this: ZipEntry zipEntry = zipFile.getEntry("dir/subdir/file1.txt"); InputStream inputStream = this.
Methods:
getComment(): String – returns the zip file comment, or null if none.
getEntry(String name): ZipEntry – returns the zip file entry for the specified name, or null if not found.
getInputStream(ZipEntry entry): InputStream – returns an input stream for reading the contents of the specified zip file entry.
The Java API provides extensive support for reading zip files; all classes related to zip file processing are located in the java.util.zip package. One of the most common tasks is to read a zip file, display the entries it contains, and then extract them into a folder.
To unzip a zip file, we read it with ZipInputStream and process each ZipEntry one by one, using FileOutputStream to write the entries to the file system. We also need to create the output directory if it doesn't exist, along with any nested directories present in the zip file.
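The extraction steps described above could be sketched roughly as follows. This is only an illustration: the class name `UnzipSketch` and the method `unzip` are made up for this example, `Files.copy` is used in place of a hand-written FileOutputStream loop, and the check against entries that escape the target directory is an addition that the text above does not mention.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, named only for this example.
final class UnzipSketch {

    /** Extracts every entry of the zip stream below targetDir. */
    static void unzip(InputStream in, Path targetDir) throws IOException {
        Path root = targetDir.toAbsolutePath().normalize();
        try (ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                Path out = root.resolve(entry.getName()).normalize();
                // Added safety check: reject names like "../x" that leave targetDir.
                if (!out.startsWith(root)) {
                    throw new IOException("Entry outside target dir: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    // Create nested directories before writing the file.
                    Files.createDirectories(out.getParent());
                    // Reads until the end of the current entry; does not close zin.
                    Files.copy(zin, out, StandardCopyOption.REPLACE_EXISTING);
                }
            }
        }
    }
}
```

A call such as `UnzipSketch.unzip(Files.newInputStream(zipPath), outputDir)` would extract a local archive; `zipPath` and `outputDir` are placeholders.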
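Here's a snippet of code (that works) that the client can use to read from the zipped stream:

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.zip.ZipEntry;
    import java.util.zip.ZipInputStream;

    static void processZippedInputStream(InputStream in, String entryNameRegex)
            throws IOException
    {
        // try-with-resources closes zin (and the underlying stream), even on exceptions
        try (ZipInputStream zin = new ZipInputStream(in))
        {
            ZipEntry ze;
            while ((ze = zin.getNextEntry()) != null)
            {
                if (ze.getName().matches(entryNameRegex))
                {
                    // treat zin as a normal input stream - i.e. read() from it until it returns -1
                    break;
                }
                // not the entry we want: skip the rest of it
                zin.closeEntry();
            }
        }
    }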
The main difference with a normal InputStream is iterating through the entries. You may know, for example, that you want the first entry, so no need for the name matching parameter etc.
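For the single-entry case in the question, a minimal sketch might look like the one below. ZipInputStream reads the local entry headers in order and does not need the directory at the end of the archive, which is why it can work on a stream that is still downloading. The names `ZipHttpStreamSketch`, `copyFirstEntry` and `download` are invented for this example, and the 64 KB buffer size is an arbitrary assumption.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URI;
import java.util.zip.ZipInputStream;

// Hypothetical helper, named only for this example.
final class ZipHttpStreamSketch {

    /**
     * Copies the first entry of a zip stream to sink.
     * Returns the number of bytes copied, or -1 if the archive has no entries.
     */
    static long copyFirstEntry(InputStream zipped, OutputStream sink) throws IOException {
        ZipInputStream zin = new ZipInputStream(zipped);
        if (zin.getNextEntry() == null) {
            return -1;
        }
        byte[] buf = new byte[64 * 1024]; // assumed buffer size
        long total = 0;
        int n;
        // read() returns -1 at the end of the current entry, not of the whole archive
        while ((n = zin.read(buf)) != -1) {
            sink.write(buf, 0, n);
            total += n;
        }
        return total;
    }

    /** Streams the first entry of a zip file served at the given URL (placeholder). */
    static long download(String url, OutputStream sink) throws IOException {
        try (InputStream in = URI.create(url).toURL().openStream()) {
            return copyFirstEntry(in, sink);
        }
    }
}
```

In a real client the sink would be replaced by whatever consumes the data; nothing is written to disk unless the sink does so.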
Java supports the gzip format with GZIPInputStream (decompressing) and GZIPOutputStream (compressing). Both zip and gzip use the same compression format internally (DEFLATE); the main difference is in the metadata: zip has it at the end of the file, gzip at the beginning (and gzip only supports one enclosed file easily).
For your use case of streaming one big file, gzip is the better choice, all the more so as you don't need access to the metadata.
I'm not sure whether HttpURLConnection sends Accept-Encoding: gzip and then inflates the content automatically if the server delivers it with Content-Encoding: gzip, but you can certainly do it manually if the server simply sends the .gz file as such (i.e. with Content-Encoding: identity).
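Doing it manually could look something like the sketch below. It assumes the connection does not inflate the body on its own, so the client sets Accept-Encoding itself and wraps the body in a GZIPInputStream when either the Content-Encoding header says gzip or the URL ends in .gz. The class `GzipHttpSketch`, its method names, and the `.gz` suffix check are assumptions made for this example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, named only for this example.
final class GzipHttpSketch {

    /** Wraps body in a GZIPInputStream when the payload is known to be gzip. */
    static InputStream decode(InputStream body, String contentEncoding, boolean isGzFile)
            throws IOException {
        boolean gzipped = isGzFile || "gzip".equalsIgnoreCase(contentEncoding);
        return gzipped ? new GZIPInputStream(body) : body;
    }

    /** Opens the URL (placeholder) and returns a stream of decompressed bytes. */
    static InputStream open(String url) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        // Assumption: we request gzip explicitly and handle inflating ourselves.
        conn.setRequestProperty("Accept-Encoding", "gzip");
        return decode(conn.getInputStream(), conn.getContentEncoding(), url.endsWith(".gz"));
    }
}
```

The caller is responsible for closing the returned stream, which also closes the connection's input stream.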
(By the way, make sure to read from the stream with buffers that are not too small, as each inflate call has native-call overhead: Java's GZIPInputStream uses the native zlib implementation.)
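One way to follow that advice, sketched under the assumption that some consumer code reads a byte at a time: give GZIPInputStream a larger internal buffer and put a BufferedInputStream on top, so small reads are served from a Java-side buffer. The class name `BufferedGzipSketch` and the 64 KB size are assumptions for this example, not measured recommendations.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, named only for this example.
final class BufferedGzipSketch {

    static final int BUFFER_SIZE = 64 * 1024; // assumed size, tune for your data

    /** Returns a decompressing stream that tolerates small read() calls. */
    static InputStream buffered(InputStream compressed) throws IOException {
        // The second constructor argument sizes the buffer of compressed input.
        InputStream inflating = new GZIPInputStream(compressed, BUFFER_SIZE);
        // Decompressed bytes are fetched in large chunks and handed out from memory.
        return new BufferedInputStream(inflating, BUFFER_SIZE);
    }

    /** Example consumer that reads one byte at a time. */
    static long countBytes(InputStream in) throws IOException {
        long total = 0;
        while (in.read() != -1) {
            total++;
        }
        return total;
    }
}
```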