Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Get Encoding list in .NET 1.1

I need to retrieve a list of supported encodings, but I'm using .NET 1.1, so the following call is not available:

using System;
using System.Text;

public class SamplesEncoding
{
   public static void Main()
   {
      // For every encoding, get the property values.
      foreach( EncodingInfo ei in Encoding.GetEncodings() )
      {
         Encoding e = ei.GetEncoding();

         Console.Write("{0,-6} {1,-25} ", ei.CodePage, ei.Name);
         Console.Write("{0,-8} {1,-8} ", e.IsBrowserDisplay, e.IsBrowserSave);
         Console.Write("{0,-8} {1,-8} ", e.IsMailNewsDisplay, e.IsMailNewsSave);
         Console.WriteLine("{0,-8} {1,-8} ", e.IsSingleByte, e.IsReadOnly);
      }
   }
}

The call Encoding.GetEncodings() is not available for .NET 1.1. Do you know any alternative method to get that list?

like image 508
Daniel Peñalba Avatar asked Sep 06 '11 13:09

Daniel Peñalba


3 Answers

Quite simple: .NET 1.1 is "fixed": it won't change. You take the Encodings of 2.0 and test if they were already present in 1.1. For example:

string[] encs = new string[] {
    "IBM037", "IBM437", "IBM500", "ASMO-708", "DOS-720", "ibm737",
    "ibm775", "ibm850", "ibm852", "IBM855", "ibm857", "IBM00858",
    "IBM860", "ibm861", "DOS-862", "IBM863", "IBM864", "IBM865",
    "cp866", "ibm869", "IBM870", "windows-874", "cp875",
    "shift_jis", "gb2312", "ks_c_5601-1987", "big5", "IBM1026",
    "IBM01047", "IBM01140", "IBM01141", "IBM01142", "IBM01143",
    "IBM01144", "IBM01145", "IBM01146", "IBM01147", "IBM01148",
    "IBM01149", "utf-16", "utf-16BE", "windows-1250",
    "windows-1251", "Windows-1252", "windows-1253", "windows-1254",
    "windows-1255", "windows-1256", "windows-1257", "windows-1258",
    "Johab", "macintosh", "x-mac-japanese", "x-mac-chinesetrad",
    "x-mac-korean", "x-mac-arabic", "x-mac-hebrew", "x-mac-greek",
    "x-mac-cyrillic", "x-mac-chinesesimp", "x-mac-romanian",
    "x-mac-ukrainian", "x-mac-thai", "x-mac-ce", "x-mac-icelandic",
    "x-mac-turkish", "x-mac-croatian", "utf-32", "utf-32BE",
    "x-Chinese-CNS", "x-cp20001", "x-Chinese-Eten", "x-cp20003",
    "x-cp20004", "x-cp20005", "x-IA5", "x-IA5-German",
    "x-IA5-Swedish", "x-IA5-Norwegian", "us-ascii", "x-cp20261",
    "x-cp20269", "IBM273", "IBM277", "IBM278", "IBM280", "IBM284",
    "IBM285", "IBM290", "IBM297", "IBM420", "IBM423", "IBM424",
    "x-EBCDIC-KoreanExtended", "IBM-Thai", "koi8-r", "IBM871",
    "IBM880", "IBM905", "IBM00924", "EUC-JP", "x-cp20936",
    "x-cp20949", "cp1025", "koi8-u", "iso-8859-1", "iso-8859-2",
    "iso-8859-3", "iso-8859-4", "iso-8859-5", "iso-8859-6",
    "iso-8859-7", "iso-8859-8", "iso-8859-9", "iso-8859-13",
    "iso-8859-15", "x-Europa", "iso-8859-8-i", "iso-2022-jp",
    "csISO2022JP", "iso-2022-jp", "iso-2022-kr", "x-cp50227",
    "euc-jp", "EUC-CN", "euc-kr", "hz-gb-2312", "GB18030",
    "x-iscii-de", "x-iscii-be", "x-iscii-ta", "x-iscii-te",
    "x-iscii-as", "x-iscii-or", "x-iscii-ka", "x-iscii-ma",
    "x-iscii-gu", "x-iscii-pa", "utf-7", "utf-8"
};

and

foreach (string enc in encs)
{
    try {
        Encoding.GetEncoding(enc);
    } catch {
        Console.WriteLine("Missing {0}", enc);
    }
}

(yes, that is the full list of encodings present in .NET 4.0... If you need them in "numeric" value, it would be quite easy to do it). Then you take the ones that don't work and take them away from that list.

To generate them (minimum .NET 2.0 and C# 3.0):

var encs = Encoding.GetEncodings().Select(p => p.Name);
//var encs = Encoding.GetEncodings().Select(p => p.CodePage);

var sb = new StringBuilder("var encs = new[] {");

foreach (var enc in encs) {
    sb.Append(" \"" + enc + "\",");
    //sb.Append(" " + enc + ",");
}

sb.Length--;

sb.Append(" };");

var str = sb.ToString();

Console.WriteLine(str);
like image 157
xanatos Avatar answered Oct 19 '22 10:10

xanatos


Assuming I haven't completely misunderstood the situation (quite possible), the following should work if you have about a minute or two to wait for it. It is really slow, not to mention a bit ugly.

public static Encoding[] GetList()
{
    ArrayList arrayList;
    arrayList = new ArrayList();
    for ( int i = 0; i < 65535; i++ )
    {
        try
        {
            arrayList.Add( Encoding.GetEncoding( i ) );
        }
        catch(Exception ex)
        {
        }
    }
    return (Encoding[])arrayList.ToArray( typeof( Encoding ) );
}
like image 36
Paul Walls Avatar answered Oct 19 '22 09:10

Paul Walls


The documentation says that you should use Encoding.GetEnconding(int CodePage) where CodePage should be one of:

The following Windows code pages exist:

874  Thai
932  Japanese
936  Chinese (simplified) (PRC, Singapore)
949  Korean
950  Chinese (traditional) (Taiwan, Hong Kong)
1200  Unicode (BMP of ISO 10646, UTF-16LE)
1201  Unicode (BMP of ISO 10646, UTF-16BE)
1250  Latin (Central European languages)
1251  Cyrillic
1252  Latin (Western European languages)
1253  Greek
1254  Turkish
1255  Hebrew
1256  Arabic
1257  Latin (Baltic languages)
1258  Vietnamese
65000  Unicode (BMP of ISO 10646, UTF-7)
65001  Unicode (BMP of ISO 10646, UTF-8)

Taken from Wikipedia

like image 29
Icarus Avatar answered Oct 19 '22 09:10

Icarus