I need to retrieve a list of supported encodings, but I'm using .NET 1.1, so the following call is not available:
using System;
using System.Text;

public class SamplesEncoding
{
    public static void Main()
    {
        // For every encoding, get the property values.
        foreach (EncodingInfo ei in Encoding.GetEncodings())
        {
            Encoding e = ei.GetEncoding();
            Console.Write("{0,-6} {1,-25} ", ei.CodePage, ei.Name);
            Console.Write("{0,-8} {1,-8} ", e.IsBrowserDisplay, e.IsBrowserSave);
            Console.Write("{0,-8} {1,-8} ", e.IsMailNewsDisplay, e.IsMailNewsSave);
            Console.WriteLine("{0,-8} {1,-8} ", e.IsSingleByte, e.IsReadOnly);
        }
    }
}
The call Encoding.GetEncodings() is not available in .NET 1.1. Do you know of an alternative way to get that list?
Quite simple: .NET 1.1 is "fixed", so its set of encodings won't change. Take the encoding names from a later version (2.0 or newer) and test which of them were already present in 1.1. For example:
string[] encs = new string[] {
"IBM037", "IBM437", "IBM500", "ASMO-708", "DOS-720", "ibm737",
"ibm775", "ibm850", "ibm852", "IBM855", "ibm857", "IBM00858",
"IBM860", "ibm861", "DOS-862", "IBM863", "IBM864", "IBM865",
"cp866", "ibm869", "IBM870", "windows-874", "cp875",
"shift_jis", "gb2312", "ks_c_5601-1987", "big5", "IBM1026",
"IBM01047", "IBM01140", "IBM01141", "IBM01142", "IBM01143",
"IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148",
"IBM01149", "utf-16", "utf-16BE", "windows-1250",
"windows-1251", "Windows-1252", "windows-1253", "windows-1254",
"windows-1255", "windows-1256", "windows-1257", "windows-1258",
"Johab", "macintosh", "x-mac-japanese", "x-mac-chinesetrad",
"x-mac-korean", "x-mac-arabic", "x-mac-hebrew", "x-mac-greek",
"x-mac-cyrillic", "x-mac-chinesesimp", "x-mac-romanian",
"x-mac-ukrainian", "x-mac-thai", "x-mac-ce", "x-mac-icelandic",
"x-mac-turkish", "x-mac-croatian", "utf-32", "utf-32BE",
"x-Chinese-CNS", "x-cp20001", "x-Chinese-Eten", "x-cp20003",
"x-cp20004", "x-cp20005", "x-IA5", "x-IA5-German",
"x-IA5-Swedish", "x-IA5-Norwegian", "us-ascii", "x-cp20261",
"x-cp20269", "IBM273", "IBM277", "IBM278", "IBM280", "IBM284",
"IBM285", "IBM290", "IBM297", "IBM420", "IBM423", "IBM424",
"x-EBCDIC-KoreanExtended", "IBM-Thai", "koi8-r", "IBM871",
"IBM880", "IBM905", "IBM00924", "EUC-JP", "x-cp20936",
"x-cp20949", "cp1025", "koi8-u", "iso-8859-1", "iso-8859-2",
"iso-8859-3", "iso-8859-4", "iso-8859-5", "iso-8859-6",
"iso-8859-7", "iso-8859-8", "iso-8859-9", "iso-8859-13",
"iso-8859-15", "x-Europa", "iso-8859-8-i", "iso-2022-jp",
"csISO2022JP", "iso-2022-jp", "iso-2022-kr", "x-cp50227",
"euc-jp", "EUC-CN", "euc-kr", "hz-gb-2312", "GB18030",
"x-iscii-de", "x-iscii-be", "x-iscii-ta", "x-iscii-te",
"x-iscii-as", "x-iscii-or", "x-iscii-ka", "x-iscii-ma",
"x-iscii-gu", "x-iscii-pa", "utf-7", "utf-8"
};
and
foreach (string enc in encs)
{
    try {
        Encoding.GetEncoding(enc);
    } catch {
        Console.WriteLine("Missing {0}", enc);
    }
}
(Yes, that is the full list of encodings present in .NET 4.0. If you need them as numeric code pages instead, that is quite easy to do as well.) Then take the names reported as missing and remove them from the list.
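The removal step can also be done in code rather than by hand. The following is only a minimal sketch, written with constructs that should compile on .NET 1.1 (ArrayList, no generics). The names SupportedEncodings and FilterSupported are hypothetical, the candidate array is shortened for illustration, and "no-such-encoding" is a made-up name included to exercise the failure path.

```csharp
using System;
using System.Collections;
using System.Text;

public class SupportedEncodings
{
    // Hypothetical helper: returns the candidate names that
    // Encoding.GetEncoding accepts on the current runtime.
    public static string[] FilterSupported(string[] candidates)
    {
        ArrayList supported = new ArrayList();
        foreach (string name in candidates)
        {
            try
            {
                Encoding.GetEncoding(name);
                supported.Add(name);
            }
            catch (Exception)
            {
                // Unknown or unsupported on this runtime: leave it out.
            }
        }
        return (string[])supported.ToArray(typeof(string));
    }

    public static void Main()
    {
        // Shortened list for illustration; substitute the full array of names.
        string[] candidates = new string[] { "utf-8", "us-ascii", "no-such-encoding" };
        foreach (string name in FilterSupported(candidates))
        {
            Console.WriteLine(name);
        }
    }
}
```

Running this once on a .NET 1.1 machine with the full array would print the names that can be kept; the output can then be pasted back into the program as a fixed list.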
To generate the list (this needs Encoding.GetEncodings() from .NET 2.0 or later, plus C# 3.0 and LINQ, which ships with .NET 3.5):
var encs = Encoding.GetEncodings().Select(p => p.Name);
//var encs = Encoding.GetEncodings().Select(p => p.CodePage);
var sb = new StringBuilder("var encs = new[] {");
foreach (var enc in encs) {
    sb.Append(" \"" + enc + "\",");
    //sb.Append(" " + enc + ",");
}
sb.Length--; // drop the trailing comma
sb.Append(" };");
var str = sb.ToString();
Console.WriteLine(str);
Assuming I haven't completely misunderstood the situation (quite possible), the following should work if you can wait a minute or two for it. It is really slow, since most of the calls end in an exception, not to mention a bit ugly.
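// Requires: using System; using System.Collections; using System.Text;
public static Encoding[] GetList()
{
    ArrayList arrayList = new ArrayList();

    // Try every possible code page number and keep the ones that resolve.
    for (int i = 0; i < 65535; i++)
    {
        try
        {
            arrayList.Add(Encoding.GetEncoding(i));
        }
        catch (Exception)
        {
            // Code page i is not supported; skip it.
        }
    }

    return (Encoding[])arrayList.ToArray(typeof(Encoding));
}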
The documentation says that you should use Encoding.GetEncoding(int CodePage), where CodePage should be one of:
The following Windows code pages exist:
874 — Thai
932 — Japanese
936 — Chinese (simplified) (PRC, Singapore)
949 — Korean
950 — Chinese (traditional) (Taiwan, Hong Kong)
1200 — Unicode (BMP of ISO 10646, UTF-16LE)
1201 — Unicode (BMP of ISO 10646, UTF-16BE)
1250 — Latin (Central European languages)
1251 — Cyrillic
1252 — Latin (Western European languages)
1253 — Greek
1254 — Turkish
1255 — Hebrew
1256 — Arabic
1257 — Latin (Baltic languages)
1258 — Vietnamese
65000 — Unicode (BMP of ISO 10646, UTF-7)
65001 — Unicode (BMP of ISO 10646, UTF-8)
Taken from Wikipedia
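As a rough sketch of how those numbers could be used, the snippet below passes each code page from the list to Encoding.GetEncoding(int) and prints the encoding's display name, or a note when the runtime rejects it. The class name WindowsCodePages is hypothetical, and which code pages actually resolve depends on the runtime and the operating system.

```csharp
using System;
using System.Text;

public class WindowsCodePages
{
    public static void Main()
    {
        // Code page numbers copied from the list above.
        int[] codePages = new int[] {
            874, 932, 936, 949, 950, 1200, 1201,
            1250, 1251, 1252, 1253, 1254, 1255, 1256, 1257, 1258,
            65000, 65001
        };

        foreach (int cp in codePages)
        {
            try
            {
                Encoding e = Encoding.GetEncoding(cp);
                Console.WriteLine("{0,-6} {1}", cp, e.EncodingName);
            }
            catch (Exception)
            {
                // The runtime does not know this code page.
                Console.WriteLine("{0,-6} (not available)", cp);
            }
        }
    }
}
```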