
How to convert Hebrew (Unicode) to ASCII in C#?

I have to create a text file that contains numbers and Hebrew letters encoded as ASCII.

This is the file creation method, which is triggered on ButtonClick:

protected void ToFile(object sender, EventArgs e)
{
    filename = Transactions.generateDateYMDHMS();
    string path = string.Format("{0}{1}.001", Server.MapPath("~/transactions/"), filename);
    StreamWriter sw = new StreamWriter(path, false, Encoding.ASCII);
    sw.WriteLine("hello");
    sw.WriteLine(Transactions.convertUTF8ASCII("שלום"));
    sw.WriteLine("bye");
    sw.Close();
}

As you can see, I use the static method Transactions.convertUTF8ASCII() to convert a .NET string (presumably Unicode) to its ASCII representation. I call it on the Hebrew word 'shalom' and get back '????' instead of the result I need.

Here is the method.

public static string convertUTF8ASCII(string initialString)
{
    byte[] unicodeBytes = Encoding.Unicode.GetBytes(initialString);
    byte[] asciiBytes = Encoding.Convert(Encoding.Unicode, Encoding.ASCII, unicodeBytes);
    return Encoding.ASCII.GetString(asciiBytes);
}

Instead of the initial word encoded as ASCII, I get '????' in the file I create. I get the same result when I step through the code in the debugger.

What am I doing wrong?

eugeneK asked Sep 06 '10 08:09


3 Answers

You can't simply translate arbitrary Unicode characters to ASCII. The best the encoder can do is replace each unsupported character with '?', hence ????. Obviously the basic 7-bit characters will work, but not much else. I'm curious as to what the expected result is?
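
To illustrate the point, here is a minimal sketch of what happens to the Hebrew word from the question. The class and method names (AsciiFallbackDemo, RoundTripThroughAscii, IsPureAscii) are made up for this example. It assumes only the standard System.Text encoders: the default Encoding.ASCII substitutes '?' silently, while an encoder built with EncoderExceptionFallback throws, which can be handy for detecting the problem early.

```csharp
using System;
using System.Text;

// Hypothetical helper class, not part of the original question's code.
public static class AsciiFallbackDemo
{
    // Encodes with the default Encoding.ASCII, which replaces every
    // character outside the 7-bit range with '?'.
    public static string RoundTripThroughAscii(string text)
    {
        byte[] asciiBytes = Encoding.ASCII.GetBytes(text);
        return Encoding.ASCII.GetString(asciiBytes);
    }

    // Returns true only when every character of text fits in 7-bit ASCII.
    public static bool IsPureAscii(string text)
    {
        // Strict ASCII encoder: throws instead of substituting '?'.
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strictAscii.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

Calling RoundTripThroughAscii("\u05E9\u05DC\u05D5\u05DD") (the word שלום written with escapes) yields "????", which matches what the question reports, and IsPureAscii returns false for the same input.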

If you need this for transfer (rather than representation), you might consider base-64 encoding the underlying UTF-8 bytes.
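
The base-64 suggestion could be sketched roughly as follows. The class and method names (Base64Transfer, ToAsciiSafe, FromAsciiSafe) are hypothetical; the sketch assumes the receiving side knows the payload is base-64 over UTF-8 and decodes it the same way. It uses only Convert.ToBase64String, Convert.FromBase64String, and Encoding.UTF8.

```csharp
using System;
using System.Text;

// Hypothetical helper class illustrating base-64 over UTF-8 bytes.
public static class Base64Transfer
{
    // Turns any string into text made only of ASCII characters
    // (A-Z, a-z, 0-9, '+', '/', '=').
    public static string ToAsciiSafe(string text)
    {
        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return Convert.ToBase64String(utf8Bytes);
    }

    // Reverses ToAsciiSafe and recovers the original string.
    public static string FromAsciiSafe(string encoded)
    {
        byte[] utf8Bytes = Convert.FromBase64String(encoded);
        return Encoding.UTF8.GetString(utf8Bytes);
    }
}
```

The output of ToAsciiSafe can be written with a StreamWriter that uses Encoding.ASCII without losing anything, but the file is no longer human-readable Hebrew, so this only fits the transfer case.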

Marc Gravell answered Oct 11 '22 18:10


Do you perhaps mean ANSI, not ASCII?

ASCII doesn't define any Hebrew characters. There are, however, some ANSI code pages that do, such as "windows-1255".

In that case, you may want to look at: http://msdn.microsoft.com/en-us/library/system.text.encoding.aspx

In short, where you have:

Encoding.ASCII

You would replace it with:

Encoding.GetEncoding(1255)
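
As a rough sketch of that replacement, the helper below encodes text as windows-1255 and writes a file with it. The names HebrewCodePage, ToWindows1255, and WriteFile are invented for this example. One assumption to note: on .NET Core and .NET 5+, code page encodings have to be registered through CodePagesEncodingProvider before GetEncoding(1255) works; on the classic .NET Framework the registration is unnecessary and can be removed.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper class illustrating the windows-1255 code page.
public static class HebrewCodePage
{
    static HebrewCodePage()
    {
        // Needed on .NET Core / .NET 5+ only; .NET Framework ships
        // code page 1255 out of the box.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Encodes text as windows-1255: one byte per Hebrew letter.
    public static byte[] ToWindows1255(string text)
    {
        return Encoding.GetEncoding(1255).GetBytes(text);
    }

    // Writes one line of text to path using windows-1255 instead of ASCII.
    public static void WriteFile(string path, string text)
    {
        using (StreamWriter sw = new StreamWriter(path, false, Encoding.GetEncoding(1255)))
        {
            sw.WriteLine(text);
        }
    }
}
```

Whoever reads the file must also open it as windows-1255, otherwise the Hebrew bytes will show up as unrelated characters.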

userx answered Oct 11 '22 18:10


If you really do mean ASCII, are you perhaps asking about transliteration (as in "Romanization") rather than encoding conversion?

peSHIr answered Oct 11 '22 18:10