I was trying to convert a file from utf-8 to Arabic-1265 encoding using the Encoding APIs in C#, but I faced a strange problem that some characters are not converted correctly such as "لا" in the following statement "ﻣﺣﻣد ﺻﻼ ح عادل" it appears as "ﻣﺣﻣد ﺻ? ح عادل". Some of my friends told me that this is because these characters are from the Arabic Presentation Forms B. I create the file using notepad++ and save it as utf-8.
here is the code I use
StreamReader sr = new StreamReader(@"C:\utf-8.txt", Encoding.UTF8);
string str = sr.ReadLine();
StreamWriter sw = new StreamWriter(@"C:\windows-1256.txt", false, Encoding.GetEncoding("windows-1256"));
sw.Write(str);
sw.Flush();
sw.Close();
But, I don't know how to convert the file correctly using this presentation forms in C#.
Yes, your string contains lots of ligatures that cannot be represented in the 1256 code page. You'll have to decompose the string before writing it. Like this:
str = str.Normalize(NormalizationForm.FormKD);
st.Write(str);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With