Where does (char)int get its symbols from?

Tags: c#, char

Being a computer programming rookie, I was given homework involving the playing card suit symbols. In the course of my research, I came across an easy way to retrieve them:

    Console.Write((char)6);

gives you ♠

    Console.Write((char)3);

gives you ♥

and so on...
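For example, this loop prints all four suits the same way:

    for (int i = 3; i <= 6; i++)
    {
        Console.Write((char)i); // prints ♥ ♦ ♣ ♠ on my console
    }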

However, I still don't understand what logic C# uses to retrieve those symbols. I mean, the ♠ symbol in the Unicode table is U+2660, yet I didn't use it. The ASCII table doesn't even contain these symbols.

So my question is, what is the logic behind (char)int?

asked Jan 31 '16 by VaVa



2 Answers

For these low numbers (below 32), this is an aspect of the console rather than of C#: the glyphs come from code page 437. The console won't display glyphs for the values it interprets as control characters, such as tab, carriage return, and bell. This isn't portable to any context where you're not running directly in a console window, so you should use e.g. (char)0x2660 instead, or just '\u2660'.
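A minimal sketch of the portable approach (the OutputEncoding line is an assumption; many console hosts need it to render non-ASCII glyphs):

    using System;
    using System.Text;

    class Suits
    {
        static void Main()
        {
            // Without this, some console hosts mangle characters outside their code page.
            Console.OutputEncoding = Encoding.UTF8;

            Console.Write('\u2660');     // ♠ via a character literal
            Console.Write((char)0x2660); // same character, cast from its codepoint
        }
    }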

answered Sep 19 '22 by Random832


The logic behind (char)int is that char is a UTF-16 code unit, one or two of which encode a Unicode codepoint. Codepoints are naturally ordinal numbers: each one identifies a member of a character set. They are often written in hexadecimal and, specifically for Unicode, preceded by U+, for example U+2660.

UTF-16 is a mapping between codepoints and code units. Code units are 16 bits wide, so they can be operated on as integers. Since a char holds one code unit, you can convert a short to a char, and since the different integer types can interoperate, you can convert an int to a char.

So your short (or int) has meaning as text only when it holds the single UTF-16 code unit of a codepoint that is encoded in one unit, i.e. a codepoint in the Basic Multilingual Plane. (You could also convert an int holding a whole codepoint to a string.)
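A short sketch of both conversions (char.ConvertFromUtf32 is the standard .NET helper for turning a whole codepoint into a string):

    int spade = 0x2660;          // U+2660 fits in a single UTF-16 code unit
    Console.Write((char)spade);  // ♠

    int ace = 0x1F0A1;           // U+1F0A1 is above U+FFFF: two code units
    Console.Write(char.ConvertFromUtf32(ace)); // 🂡 (playing card ace of spades)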

Of course, you could let the compiler figure it out for you and make it easier for your readers, too, with: Console.Write('♥');

Also, forget ASCII. It's never the right encoding (except when it is). In case it's not clear: a C# string is a counted sequence of UTF-16 code units.
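To illustrate that last point, a string's Length counts code units, not codepoints:

    Console.WriteLine("♠".Length);  // 1 — one code unit
    Console.WriteLine("🂡".Length); // 2 — a surrogate pair, still one codepoint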

answered Sep 22 '22 by Tom Blodget