I am trying to write code that zips files whose names contain non-English characters (umlauts, Arabic, etc.), but the names inside the resulting zip file come out garbled. I am using Java version 1.7.0_45, so it shouldn't be due to the bug mentioned here. I am passing the UTF-8 charset to the ZipOutputStream constructor, and according to the Javadocs that should do what I need.
I am confident that the zip file is being written correctly, because reading the entries back from the file gives the proper filenames (as expected).
However, when I open or extract the archive with either Ubuntu's default Archive Manager or the unzip tool, the filenames are messed up.
Here is my code:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

private void convertFilesToZip(List<File> files) {
    byte[] buffer = new byte[1024];
    try {
        // Write the archive; entry names are encoded as UTF-8.
        try (ZipOutputStream outputStream = new ZipOutputStream(
                new FileOutputStream("zipFile.zip"), Charset.forName("UTF-8"))) {
            for (File file : files) {
                String filename = file.getName();
                System.out.println("Adding file : " + filename);
                outputStream.putNextEntry(new ZipEntry(filename));
                // Close every input file as soon as it has been copied.
                try (FileInputStream inputStream = new FileInputStream(file)) {
                    int length;
                    while ((length = inputStream.read(buffer)) > 0) {
                        outputStream.write(buffer, 0, length);
                    }
                }
                outputStream.closeEntry();
            }
        }
        System.out.println("Zip created successfully");
        System.out.println("=======================================================");
        System.out.println("Reading zip Entries");
        // Read the entry names back using the same charset.
        try (ZipInputStream zipInputStream = new ZipInputStream(
                new FileInputStream(new File("zipFile.zip")), Charset.forName("UTF-8"))) {
            ZipEntry zipEntry;
            while ((zipEntry = zipInputStream.getNextEntry()) != null) {
                System.out.println(zipEntry.getName());
                zipInputStream.closeEntry();
            }
        }
    } catch (IOException exception) {
        exception.printStackTrace();
    }
}
The output for the files is as follows:
Adding file : umlaut_ḧ.txt
Adding file : ذ ر ز س ش ص ض.txt
Adding file : äǟc̈ḧös̈ ẗǚẍŸ_uploadFile4.txt
Adding file : pingüino.txt
Adding file : ÄÖÜäöüß- Español deEspaña.ppt
Zip created successfully
=======================================================
Reading zip Entries
umlaut_ḧ.txt
ذ ر ز س ش ص ض.txt
äǟc̈ḧös̈ ẗǚẍŸ_uploadFile4.txt
pingüino.txt
ÄÖÜäöüß- Español deEspaña.ppt
Has anyone successfully implemented what I am trying to achieve here?
Can someone point out what I may have missed or done wrong? I searched Google as much as I could and even tried Apache Commons Compress, but still no luck.
The bug report says that the bug is resolved in Java 7, so why is the code not working?
[Update] I finally figured out that the problem is not in the code but in Ubuntu's default Archive Manager. It doesn't recognize/extract the contents properly. When the same file is opened/extracted by the Windows zip handler, it works flawlessly.
Additionally, commons-compress supports a number of other formats apart from the zip and gzip formats supported by Java.
http://commons.apache.org/proper/commons-compress/index.html
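One way to back up the conclusion in the update (that the archive itself is fine and the extracting tool is at fault) is to look at the raw bytes. As far as I understand the zip format, bit 11 of the "general purpose bit flag" in each local file header marks the entry name as UTF-8, and ZipOutputStream sets it when it is constructed with a UTF-8 charset. The following is only a minimal sketch under that assumption; the class name ZipUtf8FlagCheck and the methods buildZip and hasUtf8Flag are hypothetical names made up for illustration, and it inspects only the first entry of an in-memory archive.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class ZipUtf8FlagCheck {

    // Assumption: bit 11 of the general purpose bit flag is the
    // "language encoding" (UTF-8) marker of the zip format.
    private static final int UTF8_FLAG = 1 << 11;

    /** Builds a one-entry archive in memory, written like the code in the question. */
    public static byte[] buildZip(String entryName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, StandardCharsets.UTF_8)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked to keep the sketch short.
            throw new IllegalStateException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the flag word of the first local file header (offset 6, little-endian). */
    public static boolean hasUtf8Flag(byte[] zip) {
        // A local file header starts with the signature "PK" 0x03 0x04.
        if (zip.length < 8 || zip[0] != 'P' || zip[1] != 'K' || zip[2] != 3 || zip[3] != 4) {
            throw new IllegalArgumentException("data does not start with a local file header");
        }
        int flags = (zip[6] & 0xFF) | ((zip[7] & 0xFF) << 8);
        return (flags & UTF8_FLAG) != 0;
    }

    public static void main(String[] args) {
        // \u00fc is "ü"; the escape avoids depending on the source file encoding.
        byte[] zip = buildZip("ping\u00fcino.txt");
        System.out.println("UTF-8 flag set: " + hasUtf8Flag(zip));
    }
}
```

If the flag is set and a tool still shows garbled names, that would point at the tool ignoring the flag rather than at the Java code. To check a real file instead, the same hasUtf8Flag method could be given the first few bytes of zipFile.zip.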