
Read tar.gz in Java with Commons-compression

I want to read the contents of a tar.gz file (or a tar.xz, which should work the same way). What I am doing is more or less this:

TarArchiveInputStream tarInput = new TarArchiveInputStream(new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")));
TarArchiveEntry currentEntry = tarInput.getNextTarEntry();
BufferedReader br = null;
StringBuilder sb = new StringBuilder();
while (currentEntry != null) {
    File f = currentEntry.getFile();
    br = new BufferedReader(new FileReader(f));
    System.out.println("For File = " + currentEntry.getName());
    String line;
    while ((line = br.readLine()) != null) {
        System.out.println("line="+line);
    }
}
if (br!=null) {
    br.close();
}

But getFile() on the TarArchiveEntry returns null.
I am using Apache Commons Compress 1.8.1.

zpontikas asked Sep 09 '14 16:09





1 Answer

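You can't use getFile() on a TarArchiveEntry that was read from an archive. That getter only exists for the opposite operation, when you create an entry from a File in order to add it to a tar archive. An entry read from a stream has no file on disk behind it, so getFile() returns null.

Instead, read directly from the TarArchiveInputStream. After each call to getNextTarEntry(), the stream hands you the content of the current "file", decompressing it on the fly, and reports end-of-stream when that entry ends.

For example (untested code, YMMV):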

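import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

// try-with-resources closes tarInput, which also closes the wrapped gzip and file streams
try (TarArchiveInputStream tarInput = new TarArchiveInputStream(
        new GzipCompressorInputStream(new FileInputStream("c://temp//test.tar.gz")))) {
    TarArchiveEntry currentEntry;
    // The loop condition moves to the next entry; the original code never advanced
    while ((currentEntry = tarInput.getNextTarEntry()) != null) {
        System.out.println("For File = " + currentEntry.getName());
        // Read directly from tarInput (UTF-8 is an assumption; use your files' encoding).
        // Do not close this reader: closing it would close tarInput as well.
        BufferedReader br = new BufferedReader(new InputStreamReader(tarInput, StandardCharsets.UTF_8));
        String line;
        while ((line = br.readLine()) != null) {
            System.out.println("line=" + line);
        }
    }
}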
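
To see why an entry read from an archive has no File behind it, it may help to look at what the stream contains. The sketch below uses only the JDK, not Commons Compress, and assumes the simplest case: ustar-style 512-byte headers, names under 100 characters, and sizes stored as octal text. The class MiniTarReader and the helpers readEntries and sampleTarGz are hypothetical names made up for this illustration; the sample archive is only complete enough for this reader (no header checksum is written). It is not a substitute for TarArchiveInputStream, which handles long names, other entry types and the rest of the format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class MiniTarReader {
    private static final int BLOCK = 512; // tar stores everything in 512-byte blocks

    /** Reads each regular-file entry of a gzip-compressed tar stream into a name -> text map. */
    public static Map<String, String> readEntries(InputStream gzippedTar) throws IOException {
        Map<String, String> result = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new GZIPInputStream(gzippedTar))) {
            byte[] header = new byte[BLOCK];
            while (true) {
                try {
                    in.readFully(header);
                } catch (EOFException e) {
                    break; // stream ended without the usual zero blocks
                }
                if (header[0] == 0) {
                    break; // an empty name means we reached the end-of-archive blocks
                }
                String name = field(header, 0, 100);
                int size = Integer.parseInt(field(header, 124, 12).trim(), 8); // octal text
                byte[] content = new byte[size];
                in.readFully(content); // the entry's bytes follow its header in the stream
                in.readFully(new byte[(BLOCK - size % BLOCK) % BLOCK]); // skip block padding
                byte type = header[156];
                if (type == '0' || type == 0) { // regular file
                    result.put(name, new String(content, StandardCharsets.UTF_8));
                }
            }
        }
        return result;
    }

    /** Returns the NUL-terminated ASCII text stored in a fixed-width header field. */
    private static String field(byte[] block, int offset, int length) {
        int end = offset;
        while (end < offset + length && block[end] != 0) {
            end++;
        }
        return new String(block, offset, end - offset, StandardCharsets.US_ASCII);
    }

    /** Builds a tiny gzip-compressed tar in memory (sample data for this demo only). */
    public static byte[] sampleTarGz() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            writeEntry(out, "a.txt", "hello\nworld\n");
            writeEntry(out, "b.txt", "second file\n");
            out.write(new byte[2 * BLOCK]); // two zero blocks end the archive
        }
        return bytes.toByteArray();
    }

    /** Writes a header with name, size and type only (no checksum), then the padded data. */
    private static void writeEntry(OutputStream out, String name, String text) throws IOException {
        byte[] data = text.getBytes(StandardCharsets.UTF_8);
        byte[] header = new byte[BLOCK];
        byte[] nameBytes = name.getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(nameBytes, 0, header, 0, nameBytes.length);
        byte[] sizeBytes = String.format("%011o", data.length).getBytes(StandardCharsets.US_ASCII);
        System.arraycopy(sizeBytes, 0, header, 124, sizeBytes.length);
        header[156] = '0'; // type flag: regular file
        out.write(header);
        out.write(data);
        out.write(new byte[(BLOCK - data.length % BLOCK) % BLOCK]);
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> entries = readEntries(new ByteArrayInputStream(sampleTarGz()));
        for (Map.Entry<String, String> e : entries.entrySet()) {
            System.out.println("For File = " + e.getKey());
            for (String line : e.getValue().split("\n")) {
                System.out.println("line=" + line);
            }
        }
    }
}
```

Each entry is just a header block followed by its bytes inside one continuous stream, which is why the content has to be pulled from the stream itself and why the library has nothing to return from getFile() when reading.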
Simone Gianni answered Nov 12 '22 01:11
