Zip file with utf-8 file names

On my website I have an option to download all images uploaded by users. The problem is with images that have Hebrew names (I need the original file name). I tried to re-encode the file names, but that does not help. Here is the code:

using ICSharpCode.SharpZipLib.Zip;

Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(file.Name);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string name = iso.GetString(isoBytes);

var entry = new ZipEntry(name + ".jpg");
zipStream.PutNextEntry(entry);
using (var reader = new System.IO.FileStream(file.Name, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
{
    byte[] buffer = new byte[ChunkSize];
    int bytesRead;
    while ((bytesRead = reader.Read(buffer, 0, buffer.Length)) > 0)
    {
        byte[] actual = new byte[bytesRead];
        Buffer.BlockCopy(buffer, 0, actual, 0, bytesRead);
        zipStream.Write(actual, 0, actual.Length);
    }
} 

After the UTF-8 conversion, the Hebrew file names come out like this: ??????.jpg. Where is my mistake?

freethinker asked Dec 20 '12 08:12


2 Answers

Unicode (UTF-8 is one of its binary encodings) can represent far more characters than any 8-bit encoding such as ISO-8859-1. Moreover, your round trip through ISO-8859-1 cannot preserve the text: Hebrew letters have no ISO-8859-1 representation, so each one is replaced with a question mark, which is why you get garbage for your file names. You should really read Joel's article on Unicode.

...

Now that you've read the article, you should know that a C# string can store Unicode data, so you probably don't need to convert file.Name at all and can pass it directly to the ZipEntry constructor, provided the library does not contain encoding-handling bugs (which is always possible).
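To make the point concrete, here is a minimal sketch that repeats the conversion from the question on a sample name, using only the base class library. The class name `EncodingRoundTripDemo`, the method `ThroughLatin1`, and the sample strings are hypothetical and chosen for illustration only.

```csharp
using System;
using System.Text;

public static class EncodingRoundTripDemo
{
    // Repeats the question's UTF-8 -> ISO-8859-1 round trip on an arbitrary name.
    public static string ThroughLatin1(string name)
    {
        Encoding iso = Encoding.GetEncoding("ISO-8859-1");
        Encoding utf8 = Encoding.UTF8;
        byte[] utfBytes = utf8.GetBytes(name);
        // Characters that ISO-8859-1 cannot represent are replaced here.
        byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
        return iso.GetString(isoBytes);
    }

    public static void Main()
    {
        // Four Hebrew letters written as escapes (hypothetical sample name).
        string hebrew = "\u05E9\u05DC\u05D5\u05DD";

        // The Hebrew letters do not survive the round trip.
        Console.WriteLine(ThroughLatin1(hebrew) + ".jpg");

        // Plain ASCII names are unaffected, which is why the bug only
        // shows up for non-Latin file names.
        Console.WriteLine(ThroughLatin1("photo") == "photo");
    }
}
```

The original string was already fine before the conversion, so the simplest fix is to drop these lines and use `file.Name` unchanged.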

Sylvain Defresne answered Sep 20 '22 08:09


Try using

ZipStrings.UseUnicode = true;

It should be a part of the ICSharpCode.SharpZipLib.Zip namespace.

After that you can use something like

var newZipEntry = new ZipEntry("My ünicödë string.pdf");

and add the entry as normal to the stream. You shouldn't need to do any conversion of the string before that in C#.
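Put together with the loop from the question, this could look roughly like the sketch below. It assumes a SharpZipLib version that exposes `ZipStrings.UseUnicode` and that each `file` is a `System.IO.FileInfo`; the class `ImageZipper` and the method `WriteZip` are hypothetical names, not part of the library.

```csharp
using System.Collections.Generic;
using System.IO;
using ICSharpCode.SharpZipLib.Zip;

public static class ImageZipper
{
    // Hypothetical helper: writes the given files into a zip archive on `output`.
    public static void WriteZip(Stream output, IEnumerable<FileInfo> files)
    {
        // Assumption: this SharpZipLib version provides ZipStrings.UseUnicode.
        ZipStrings.UseUnicode = true;

        using (var zipStream = new ZipOutputStream(output))
        {
            // Leave `output` open so the caller can still read or send it.
            zipStream.IsStreamOwner = false;

            foreach (FileInfo file in files)
            {
                // Pass the .NET string as-is; no Encoding.Convert round trip.
                var entry = new ZipEntry(ZipEntry.CleanName(file.Name));
                zipStream.PutNextEntry(entry);

                using (var reader = new FileStream(file.FullName, FileMode.Open,
                                                   FileAccess.Read, FileShare.ReadWrite))
                {
                    // Stream.CopyTo replaces the manual buffer loop.
                    reader.CopyTo(zipStream);
                }

                zipStream.CloseEntry();
            }
        }
    }
}
```

Whether the names display correctly after extraction also depends on the unzip tool honoring the UTF-8 flag in the archive, so it is worth checking with the tools your users are likely to have.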

hug answered Sep 22 '22 08:09