
Modifying a text file in a ZIP archive in Java

My use case requires me to open a text file, say abc.txt, that sits inside a zip archive and contains key-value pairs in the form

key1=value1

key2=value2

... and so on, with each key-value pair on its own line. I have to change the value for one particular key and put the text file back into a new copy of the archive. How do I do this in Java?

My attempt so far:

    ZipFile zipFile = new ZipFile("test.zip");
    final ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("out.zip"));
    for(Enumeration e = zipFile.entries(); e.hasMoreElements(); ) {
        ZipEntry entryIn = (ZipEntry) e.nextElement();
        if(!entryIn.getName().equalsIgnoreCase("abc.txt")){
            zos.putNextEntry(entryIn);
            InputStream is = zipFile.getInputStream(entryIn);
            byte [] buf = new byte[1024];
            int len;
            while((len = (is.read(buf))) > 0) {            
                zos.write(buf, 0, len);
            }
        }
        else{
            // I'm not sure what to do here
            // Tried a few things and the file gets corrupt
        }
        zos.closeEntry();
    }
    zos.close();
Prabhakar asked Jul 16 '12 10:07



3 Answers

Java 7 introduced a much simpler way to manipulate zip archives: the FileSystems API, which lets you access the contents of a zip file as a file system.

Besides offering a much more straightforward API, it modifies the archive in place and doesn't require your code to rewrite the other (irrelevant) files in the zip archive, as the accepted answer does.

Here's sample code that solves the OP's use case:

    import java.io.*;
    import java.nio.file.*;

    public static void main(String[] args) throws IOException {
        modifyTextFileInZip("test.zip");
    }

    static void modifyTextFileInZip(String zipPath) throws IOException {
        Path zipFilePath = Paths.get(zipPath);
        // The cast selects the (Path, ClassLoader) overload; a bare null is ambiguous on Java 13+.
        try (FileSystem fs = FileSystems.newFileSystem(zipFilePath, (ClassLoader) null)) {
            Path source = fs.getPath("/abc.txt");
            Path temp = fs.getPath("/___abc___.txt");
            if (Files.exists(temp)) {
                throw new IOException("temp file exists, generate another name");
            }
            // Move the original aside, then copy it back line by line with the change applied.
            Files.move(source, temp);
            streamCopy(temp, source);
            Files.delete(temp);
        }
    }

    static void streamCopy(Path src, Path dst) throws IOException {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(Files.newInputStream(src)));
             BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(Files.newOutputStream(dst)))) {

            String line;
            while ((line = br.readLine()) != null) {
                line = line.replace("key1=value1", "key1=value2");
                bw.write(line);
                bw.newLine();
            }
        }
    }

For more zip archive manipulation examples, see the demo/nio/zipfs/Demo.java sample, which is part of the JDK 8 Demos and Samples download.
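If abc.txt is small enough to hold in memory, the same idea can be sketched without the temporary entry: read all lines, change the one you need, and write the entry back. This is only a sketch under assumptions not stated above: the file is UTF-8, the match is on the key prefix rather than on the full "key1=value1" string, and the class and method names (ZipFsValueUpdater, setValue) are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ZipFsValueUpdater {

    /** Sets key=newValue in one text entry of the archive (hypothetical helper). */
    static void setValue(Path zip, String entryName, String key, String newValue)
            throws IOException {
        // The cast selects the (Path, ClassLoader) overload of newFileSystem.
        try (FileSystem fs = FileSystems.newFileSystem(zip, (ClassLoader) null)) {
            Path entry = fs.getPath(entryName);
            List<String> updated = new ArrayList<>();
            // Assumption: the entry is UTF-8 text and fits in memory.
            for (String line : Files.readAllLines(entry, StandardCharsets.UTF_8)) {
                updated.add(line.startsWith(key + "=") ? key + "=" + newValue : line);
            }
            // With no options given, Files.write creates or truncates the target,
            // so the existing entry is replaced.
            Files.write(entry, updated, StandardCharsets.UTF_8);
        } // closing the file system writes the change back to the archive
    }
}
```

For the OP's case the call would look like `ZipFsValueUpdater.setValue(Paths.get("test.zip"), "/abc.txt", "key1", "value2")`. Note that, like the code above, this changes test.zip itself rather than producing a new copy.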

Alex Lipov answered Nov 03 '22 10:11



You almost had it right. One possible reason the file showed up as corrupted is that you might have used

zos.putNextEntry(entryIn)

in the else part as well. This creates a new entry in the output zip that carries over the metadata of the existing entry, including the entry name (file name) and its CRC, among other things.

Then, when you write the updated text and close the entry, an error can be thrown because the CRC recorded in the entry differs from the CRC of the data you actually wrote.

You might also get an error if the replacement text has a different length from the original, i.e. if you try to replace

key1=value1

with

key1=val1

This boils down to the same problem: the number of bytes you write differs from the size recorded in the entry.

ZipFile zipFile = new ZipFile("test.zip");
final ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("out.zip"));
for(Enumeration e = zipFile.entries(); e.hasMoreElements(); ) {
    ZipEntry entryIn = (ZipEntry) e.nextElement();
    if (!entryIn.getName().equalsIgnoreCase("abc.txt")) {
        zos.putNextEntry(entryIn);
        InputStream is = zipFile.getInputStream(entryIn);
        byte[] buf = new byte[1024];
        int len;
        while((len = is.read(buf)) > 0) {            
            zos.write(buf, 0, len);
        }
    }
    else{
        zos.putNextEntry(new ZipEntry("abc.txt"));

        InputStream is = zipFile.getInputStream(entryIn);
        byte[] buf = new byte[1024];
        int len;
        while ((len = (is.read(buf))) > 0) {
            // Decode only the len bytes just read; the rest of buf may hold stale data.
            String s = new String(buf, 0, len);
            if (s.contains("key1=value1")) {
                buf = s.replaceAll("key1=value1", "key1=val2").getBytes();
            }
            zos.write(buf, 0, (len < buf.length) ? len : buf.length);
        }
    }
    zos.closeEntry();
}
zos.close();

The following expression ensures that no IndexOutOfBoundsException occurs even if the replacement data is shorter than the original:

(len < buf.length) ? len : buf.length
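To see the metadata difference described above for yourself, here is a small sketch that compares an entry read from an archive with a fresh ZipEntry of the same name. The class name EntryMetadataDemo and the helper createSampleZip are made up for illustration; the sample archive is written to a temp file.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class EntryMetadataDemo {

    /** Writes a one-entry archive to a temp file and returns it (hypothetical helper). */
    static File createSampleZip(String name, String content) throws IOException {
        File zip = File.createTempFile("sample", ".zip");
        zip.deleteOnExit();
        try (ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(zip))) {
            zos.putNextEntry(new ZipEntry(name));
            zos.write(content.getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zip;
    }

    public static void main(String[] args) throws IOException {
        File zip = createSampleZip("abc.txt", "key1=value1\n");
        try (ZipFile zipFile = new ZipFile(zip)) {
            ZipEntry existing = zipFile.getEntry("abc.txt");
            ZipEntry fresh = new ZipEntry(existing.getName());
            // Read from the archive: size and CRC describe the OLD content.
            System.out.println("existing: size=" + existing.getSize()
                    + " crc=" + existing.getCrc());
            // Fresh entry: size and CRC are -1 (unknown), so ZipOutputStream
            // fills them in from whatever is actually written.
            System.out.println("fresh:    size=" + fresh.getSize()
                    + " crc=" + fresh.getCrc());
        }
    }
}
```

That is why the else branch uses new ZipEntry("abc.txt") instead of passing entryIn: nothing recorded for the old content can conflict with the new content.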

Shiva answered Nov 03 '22 10:11



Just a small improvement to:

else{
    zos.putNextEntry(new ZipEntry("abc.txt"));

    InputStream is = zipFile.getInputStream(entryIn);
    byte[] buf = new byte[1024];
    int len;
    while ((len = (is.read(buf))) > 0) {
        // Decode only the len bytes just read; the rest of buf may hold stale data.
        String s = new String(buf, 0, len);
        if (s.contains("key1=value1")) {
            buf = s.replaceAll("key1=value1", "key1=val2").getBytes();
        }
        zos.write(buf, 0, (len < buf.length) ? len : buf.length);
    }
}

That should be:

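else {
    zos.putNextEntry(new ZipEntry("abc.txt"));

    InputStream is = zipFile.getInputStream(entryIn);
    long size = entryIn.getSize(); // -1 if the size is unknown
    if (size < 0 || size > Integer.MAX_VALUE) {
        throw new IllegalStateException("...");
    }
    byte[] bytes = new byte[(int) size];
    // A single read() may return fewer bytes than requested, so loop until the array is full.
    int off = 0;
    while (off < bytes.length) {
        int n = is.read(bytes, off, bytes.length - off);
        if (n < 0) break;
        off += n;
    }
    // replace() matches literally; replaceAll() would treat the first argument as a regex.
    zos.write(new String(bytes, 0, off).replace("key1=value1", "key1=val2").getBytes());
}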

This captures all the occurrences. With the first version, you could get "key1" in one read and "=value1" in the next, so the occurrence you want to change would never be matched.
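Putting the pieces together, a self-contained version of the whole copy could look like the sketch below. It rests on a few assumptions of mine: every entry fits in memory, abc.txt is UTF-8, and the value is replaced by key rather than by matching the old "key1=value1" text. The names ZipValueReplacer, readFully, replaceValue and copyWithNewValue are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Enumeration;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

public class ZipValueReplacer {

    /** Reads a stream to the end; one read() call is not guaranteed to return everything. */
    static byte[] readFully(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int len;
        while ((len = in.read(buf)) != -1) {
            out.write(buf, 0, len);
        }
        return out.toByteArray();
    }

    /** Replaces the value of the given key in key=value text; other lines are kept. */
    static String replaceValue(String text, String key, String newValue) {
        // Limit -1 keeps a trailing empty string, so a final newline survives the join.
        String[] lines = text.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            if (lines[i].startsWith(key + "=")) {
                String eol = lines[i].endsWith("\r") ? "\r" : ""; // keep CRLF endings
                lines[i] = key + "=" + newValue + eol;
            }
        }
        return String.join("\n", lines);
    }

    /** Copies every entry of in to out, changing one key in the named text entry. */
    static void copyWithNewValue(File in, File out, String entryName,
                                 String key, String newValue) throws IOException {
        try (ZipFile zipFile = new ZipFile(in);
             ZipOutputStream zos = new ZipOutputStream(new FileOutputStream(out))) {
            Enumeration<? extends ZipEntry> entries = zipFile.entries();
            while (entries.hasMoreElements()) {
                ZipEntry entryIn = entries.nextElement();
                // Fresh entry: size and CRC are computed from what is written here.
                zos.putNextEntry(new ZipEntry(entryIn.getName()));
                if (!entryIn.isDirectory()) {
                    try (InputStream is = zipFile.getInputStream(entryIn)) {
                        byte[] data = readFully(is);
                        if (entryIn.getName().equals(entryName)) {
                            String text = new String(data, StandardCharsets.UTF_8);
                            data = replaceValue(text, key, newValue)
                                    .getBytes(StandardCharsets.UTF_8);
                        }
                        zos.write(data);
                    }
                }
                zos.closeEntry();
            }
        }
    }
}
```

Called as `ZipValueReplacer.copyWithNewValue(new File("test.zip"), new File("out.zip"), "abc.txt", "key1", "val2")`. One trade-off to be aware of: creating fresh entries by name drops the original timestamps and comments, so copy those over explicitly if you need them.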

Santiago Ruiz answered Nov 03 '22 10:11
