For this C#, a == true:

bool a = "hello" + '/' + "world" == "hello/world";

And for this C#, b == true:

bool b = "hello" + + '/' + "world" == "hello47world";
I'm wondering how this can be, and more importantly, why did the C# language architects choose this behavior?
The second + is converting the char to an int and adding it into the string. The ASCII value for / is 47, which is then converted to a string by the other + operator.
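A minimal sketch of both concatenations as a self-contained console program (the Demo class name is my own):

```csharp
using System;

class Demo
{
    static void Main()
    {
        // Binary +: the char '/' is appended as a character.
        string a = "hello" + '/' + "world";

        // Unary + first converts '/' to its code point (int 47),
        // then string concatenation formats the int as "47".
        string b = "hello" + + '/' + "world";

        Console.WriteLine(a); // hello/world
        Console.WriteLine(b); // hello47world
    }
}
```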
The + operator before the slash implicitly casts it to an int. See "+ Operator" on MSDN and look at the "unary plus":

The result of a unary + operation on a numeric type is just the value of the operand.
I actually figured this out by looking at what the + operators were actually calling. (I think this is a ReSharper or VS 2015 feature.)
That's because you are using the unary + operator. It's similar to the unary - operator, but it doesn't change the sign of the operand, so the only effect it has here is to implicitly convert the character '/' into an int.

The value of +'/' is the character code of /, which is 47.

The code does the same as:

bool b = "hello" + (int)'/' + "world" == "hello47world";
Why, I hear you ask, is the char specifically treated to the operator int operator +(int x) rather than one of the many other fine unary + operators available?

- char doesn't have any of those, so the compiler looks at the predefined unary + operators.
- None of the predefined operators takes a char either, so the compiler uses the overload resolution rules to decide which operator (of int, uint, long, ulong, float, double, decimal) is the best.
- All of them are applicable, because there is an implicit conversion from char to each of those types.
- int beats out long, float and double because you can implicitly convert int to those types and not back.
- int beats uint and ulong because... the best conversion rule says it does.

How this occurs is an implicit cast: "A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal." (char, MSDN)
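One way to see that overload resolution settles on the int overload is to let var infer the type of the unary + expression (a small sketch; the variable name is illustrative):

```csharp
using System;

class Demo
{
    static void Main()
    {
        // var infers the result type of +'/': overload resolution
        // picks int operator +(int), so this is an int.
        var promoted = +'/';

        Console.WriteLine(promoted.GetType()); // System.Int32
        Console.WriteLine(promoted);           // 47
    }
}
```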
The simplest form of the reproduction is:
int slash = +'/'; // 47
Char internally is a struct. "Purpose: This is the value class representing a Unicode character" (char.cs, MS Reference Source), and the reason the struct can be implicitly cast is because it implements the IConvertible interface.
public struct Char : IComparable, IConvertible
Specifically, with this piece of code
/// <internalonly/>
int IConvertible.ToInt32(IFormatProvider provider) {
return Convert.ToInt32(m_value);
}
The IConvertible interface states in a comment in the code:
// The IConvertible interface represents an object that contains a value. This
// interface is implemented by the following types in the System namespace:
// Boolean, Char, SByte, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64,
// Single, Double, Decimal, DateTime, TimeSpan, and String.
Looking back to the purpose of the struct (to be a value representative of a Unicode character), it is clear that the intention for this behavior in the language was to provide a way for the value to be converted to supported types. IConvertible goes on to state:
// The implementations of IConvertible provided by the System.XXX value classes
// simply forward to the appropriate Value.ToXXX(YYY) methods (a description of
// the Value class follows below). In cases where a Value.ToXXX(YYY) method
// does not exist (because the particular conversion is not supported), the
// IConvertible implementation should simply throw an InvalidCastException.
Which is explicitly stating that conversions which are not supported throw exceptions. It is also explicitly stated that converting a character to an integer will give the integer value of that character.
The ToInt32(Char) method returns a 32-bit signed integer that represents the UTF-16 encoded code unit of the value argument. (Convert.ToInt32 Method (Char), MSDN)
All in all, the reasoning for the behavior seems to be self-evident. The integer value of the char has meaning as a "UTF-16 encoded code unit", and the slash's value is 47.
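A quick illustration of that code-unit relationship: Convert.ToInt32(Char), the explicit cast, and the unary + all agree on the same value (a self-contained sketch; the Demo class name is my own):

```csharp
using System;

class Demo
{
    static void Main()
    {
        // All three yield the UTF-16 code unit of the character '/'.
        Console.WriteLine(Convert.ToInt32('/')); // 47
        Console.WriteLine((int)'/');             // 47
        Console.WriteLine(+'/');                 // 47
    }
}
```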
As a result of the value cast present, and because char is a built-in numeric type, the implicit cast to integer from the plus sign is done at compile time. This can be seen with the reuse of the above simple example in a small program (LINQPad works to test this):
void Main()
{
int slash = +'/';
Console.WriteLine(slash);
}
Becomes
IL_0000: ldc.i4.s 2F
IL_0002: stloc.0 // slash
IL_0003: ldloc.0 // slash
IL_0004: call System.Console.WriteLine
IL_0009: ret
Where the '/' is simply converted to the hexadecimal value 2F (47 in decimal) and then used from there.
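Because the compiler folds the conversion at compile time, +'/' is even usable as a constant expression (a small sketch to illustrate; the Slash field name is my own):

```csharp
using System;

class Demo
{
    // The compiler folds +'/' to the literal 47 (0x2F),
    // so it can initialize a const field.
    const int Slash = +'/';

    static void Main()
    {
        Console.WriteLine(Slash);         // 47
        Console.WriteLine(Slash == 0x2F); // True
    }
}
```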
+'/' gives you 47, the UTF-16 (decimal) character code of the character "/", and @Guffa already explained why.
In C#, a char is expressed in single quotes, i.e. '/' in your case. The + operator in front of the char acts as a unary operator and asks the compiler to provide the UTF-16 value of the char '/', which is 47.