I need a way to convert special characters like the æ in Helloæ to normal characters, so that this word would end up being Helloae. So far I have tried HttpUtility.Decode, or a method that would convert UTF8 to win1252, but nothing worked. Is there something simple and generic that would do this job?
Thank you.
EDIT
I have tried implementing those two methods using posts here on OC. Here are the methods:
public static string ConvertUTF8ToWin1252(string _source)
{
    Encoding utf8 = new UTF8Encoding();
    Encoding win1252 = Encoding.GetEncoding(1252);
    byte[] input = _source.ToUTF8ByteArray();
    byte[] output = Encoding.Convert(utf8, win1252, input);
    return win1252.GetString(output);
}

// It should be noted that this method is expecting UTF-8 input only,
// so you probably should give it a more fitting name.
private static byte[] ToUTF8ByteArray(this string _str)
{
    Encoding encoding = new UTF8Encoding();
    return encoding.GetBytes(_str);
}
But it did not work. The string remains the same.
See: Does .NET transliteration library exists?
UnidecodeSharpFork
Usage:
var result = "Helloæ".Unidecode();
Console.WriteLine(result); // Prints Helloae
There is no direct mapping between æ and ae; they are completely different Unicode code points. If you need to do this, you'll most likely need to write a function that maps the offending code points to the strings that you desire.
Per the comments you may need to take a two stage approach to this:
switch (badChar)
{
    case 'æ':
        return "ae";
    case 'ø':
        return "oe";
    // and so on
}
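For illustration, here is a minimal sketch of how such a two-stage function might look. The stages are not spelled out above, so this assumes that stage one strips diacritics via Unicode normalization and stage two maps whatever is left through a lookup table. The names Transliterator, Transliterate and Replacements are made up for this example, and the table is only a starting point, not a complete mapping.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class Transliterator
{
    // Hypothetical mapping table for characters that have no Unicode
    // decomposition; extend it with whatever characters you need.
    private static readonly Dictionary<char, string> Replacements =
        new Dictionary<char, string>
        {
            { 'æ', "ae" }, { 'Æ', "AE" },
            { 'ø', "oe" }, { 'Ø', "OE" },
            { 'ß', "ss" },
        };

    public static string Transliterate(string source)
    {
        // Stage 1 (assumed): decompose accented characters into a base
        // letter plus combining marks, e.g. é becomes e + U+0301.
        string decomposed = source.Normalize(NormalizationForm.FormD);

        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks produced by the decomposition.
            if (CharUnicodeInfo.GetUnicodeCategory(c) == UnicodeCategory.NonSpacingMark)
                continue;

            // Stage 2: map leftover special characters to plain strings;
            // anything not in the table passes through unchanged.
            string replacement;
            if (Replacements.TryGetValue(c, out replacement))
                sb.Append(replacement);
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this sketch, Transliterator.Transliterate("Helloæ") returns Helloae. Characters that are neither decomposable nor in the table are left as they are, so you can see what still needs a mapping.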