Unicode versions in .NET

The documentation of CharUnicodeInfo.GetUnicodeCategory says:

Note that CharUnicodeInfo.GetUnicodeCategory does not always return the same UnicodeCategory value as the Char.GetUnicodeCategory method when passed a particular character as a parameter.

The CharUnicodeInfo.GetUnicodeCategory method is designed to reflect the current version of the Unicode standard. In contrast, although the Char.GetUnicodeCategory method usually reflects the current version of the Unicode standard, it might return a character's category based on a previous version of the standard, or it might return a category that differs from the current standard to preserve backward compatibility.

So, which version of the Unicode standard is reflected by CharUnicodeInfo.GetUnicodeCategory and Char.GetUnicodeCategory in which version of the .NET Framework?

dtb asked Jul 16 '09 02:07



1 Answer

The documentation for the String Class states the Unicode version that the .NET Framework 4 and 4.5 conform to:

.NET Framework 4

In the .NET Framework 4, sorting, casing, normalization, and Unicode character information is synchronized with Windows 7 and conforms to the Unicode 5.1 standard.

.NET Framework 4.5

In the .NET Framework 4.5 running on the Windows 8 operating system, sorting, casing, normalization, and Unicode character information conforms to the Unicode 6.0 standard. On other operating systems, it conforms to the Unicode 5.0 standard.
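The documentation quotes above give the Unicode version per framework release, but they do not say where Char.GetUnicodeCategory and CharUnicodeInfo.GetUnicodeCategory actually disagree. One way to check on a given machine is to compare the two methods over every UTF-16 code unit. This is only a minimal sketch: the names CategoryDiff and CountMismatches are made up for illustration, and the result depends on the runtime and operating system, so no particular output should be expected.

```csharp
using System;
using System.Globalization;

static class CategoryDiff
{
    // Hypothetical helper: counts the UTF-16 code units for which the two
    // methods report different categories, optionally printing each one.
    public static int CountMismatches(bool print)
    {
        int mismatches = 0;
        for (int i = char.MinValue; i <= char.MaxValue; i++)
        {
            char c = (char)i;
            UnicodeCategory fromChar = char.GetUnicodeCategory(c);
            UnicodeCategory fromInfo = CharUnicodeInfo.GetUnicodeCategory(c);
            if (fromChar != fromInfo)
            {
                mismatches++;
                if (print)
                {
                    Console.WriteLine("U+{0:X4}: Char={1}, CharUnicodeInfo={2}",
                                      i, fromChar, fromInfo);
                }
            }
        }
        return mismatches;
    }

    static void Main()
    {
        // Environment.Version reports the runtime version, not the Unicode version.
        Console.WriteLine("Runtime version: " + Environment.Version);
        Console.WriteLine("Mismatches: " + CountMismatches(true));
    }
}
```

A count of zero would only mean the two methods agree on this runtime. It does not reveal which Unicode version is in use; for that, the String class documentation quoted above remains the reference.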

dtb answered Sep 25 '22 00:09