Relation between .NET Encoding and Characterset

Question

What is the relation between the CharacterSet setting described here:
http://msdn.microsoft.com/en-us/library/ms709353(VS.85).aspx
and the ASCII encoding described here:
http://msdn.microsoft.com/en-us/library/system.text.asciiencoding.getbytes(VS.71).aspx

Joe · Accepted Answer

ANSI is the current Windows ANSI code page, equivalent to Encoding.Default on .NET Framework (on .NET Core and later, Encoding.Default is always UTF-8).

OEM is the current OEM code page, typically used by console applications.

You can get it using:

Encoding.GetEncoding(CultureInfo.CurrentCulture.TextInfo.OEMCodePage)

In a console application, the OEM encoding is usually also available through:

Console.OutputEncoding
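
As a rough sketch of the lookups above, assuming .NET 5 or later, where the legacy code pages only become available after CodePagesEncodingProvider is registered (on .NET Framework that registration is not needed). The class and method names are made up for illustration:

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper, not part of the framework.
public static class CodePageLookup
{
    static CodePageLookup()
    {
        // Makes legacy code pages such as 437 and 1252 resolvable on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Encoding for the ANSI code page associated with the given culture.
    public static Encoding GetAnsiEncoding(CultureInfo culture)
    {
        return Encoding.GetEncoding(culture.TextInfo.ANSICodePage);
    }

    // Encoding for the OEM code page associated with the given culture.
    public static Encoding GetOemEncoding(CultureInfo culture)
    {
        return Encoding.GetEncoding(culture.TextInfo.OEMCodePage);
    }
}
```

Passing CultureInfo.CurrentCulture gives the values for the running system. With CultureInfo.InvariantCulture the OEM code page is 437 and the ANSI code page is 1252.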

Hans Passant · Answer

This is really, really ancient. ODBC dates from the stone age, back when Windows started taking over from MS-DOS. Back then, lots of text was still encoded in the original IBM-PC character set, named the "OEM Character Set" by Microsoft. The standard IBM-PC set had some accented characters and pseudo-graphics glyphs in the upper half, codes 0x80-0xff.

That set was too limited for text output in non-English languages, so Microsoft started using code pages: sets of character glyphs suitable for a certain language group. The American English set of characters was standardized by ANSI, and that label is now attached (incorrectly) to any non-OEM code page.

Nobody encodes text in the OEM character set anymore; it went the way of the dodo at least 10 years ago. The proper setting here is ANSI, while keeping your fingers crossed behind your back that the code page used to encode the text matches your system's default code page. That's dodo too; Unicode solved it.
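
To see why the code pages have to match, here is a small sketch that encodes text with one code page and decodes it with another. It assumes .NET 5 or later (hence the provider registration), and the class and method names are hypothetical:

```csharp
using System;
using System.Text;

// Hypothetical demo class, not part of the framework.
public static class CodePageMismatch
{
    static CodePageMismatch()
    {
        // Needed on .NET Core / .NET 5+ so that code pages 437 and 1252 can be resolved.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Encodes text with the writer's code page, then decodes the bytes with the reader's.
    public static string Reinterpret(string text, int writerCodePage, int readerCodePage)
    {
        byte[] bytes = Encoding.GetEncoding(writerCodePage).GetBytes(text);
        return Encoding.GetEncoding(readerCodePage).GetString(bytes);
    }
}
```

Reinterpret("café", 1252, 437) returns "cafΘ": Windows-1252 stores é as byte 0xE9, and the OEM code page 437 maps that byte to the Greek capital theta. When writer and reader agree (1252 both times), the text round-trips unchanged.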
