
Why does "hello" + + '/' + "world" == "hello47world"?

For this C#, a==true:

bool a = "hello" +   '/' + "world" == "hello/world";

And for this C#, b==true:

bool b = "hello" + + '/' + "world" == "hello47world";

I'm wondering how this can be, and more importantly, why did the C# language architects choose this behavior?

Contango asked Aug 01 '15 17:08


6 Answers

The second + is converting the char to an int, and adding it into the string. The ASCII value for / is 47, which is then converted to a string by the other + operator.

The + operator before the slash implicitly casts it to an int. See + Operator on MSDN and look at the "unary plus".

The result of a unary + operation on a numeric type is just the value of the operand.

I actually figured this out by looking at what the + operators were actually calling. (I think this is a ReSharper or VS 2015 feature)
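The effect of the unary plus can also be checked directly. The following is a minimal sketch, not part of the original answer; the class and method names (UnaryPlusDemo, BinaryOnly, WithUnaryPlus, PromotedType) are made up for illustration.

```csharp
using System;

public static class UnaryPlusDemo
{
    // Binary + only: the char is appended to the string as a character.
    public static string BinaryOnly() => "hello" + '/' + "world";

    // Unary + first converts the char to int, so the number 47 is appended.
    public static string WithUnaryPlus() => "hello" + +'/' + "world";

    // The expression +'/' has type int; GetType() reports that at runtime.
    public static Type PromotedType() => (+'/').GetType();

    public static void Main()
    {
        Console.WriteLine(BinaryOnly());     // hello/world
        Console.WriteLine(WithUnaryPlus());  // hello47world
        Console.WriteLine(PromotedType());   // System.Int32
    }
}
```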


Cyral answered Oct 09 '22 12:10


That's because you are using the unary operator +. It's similar to the unary operator -, but it doesn't change the sign of the operand, so the only effect it has here is to implicitly convert the character '/' into an int.

The value of +'/' is the character code of /, which is 47.

The code does the same as:

bool b = "hello" + (int)'/' + "world" == "hello47world";

Guffa answered Oct 09 '22 13:10


Why, I hear you ask, is the char specifically treated to the operator int operator +(int x) rather than one of the many other fine unary + operators available?

  • The unary operator overload resolution rules say to look at user-defined unary operators first, but since char doesn't have any of those, the compiler looks at the predefined unary + operators.
  • Obviously none of those take a char either, so the compiler uses the overload resolution rules to decide which operator (of int, uint, long, ulong, float, double, decimal) is the best.
  • Those resolution rules say to look at which is the best function... which pretty much says to look at which parameter type offers the best conversion from char.
  • int beats out long, float, double and decimal because you can implicitly convert int to those types and not back.
  • int beats uint and ulong because... the best conversion rule says it does: when neither type converts implicitly to the other, a signed integral type is preferred over an unsigned one.
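
The same preference can be observed with ordinary method overloads, which go through the same better-conversion rules. This is only a sketch: the Which overloads below are hypothetical stand-ins for the predefined unary + operators, one per operand type.

```csharp
using System;

public static class OverloadDemo
{
    // Hypothetical overloads mirroring the operand types of the
    // predefined unary + operators. None of them takes a char.
    public static string Which(int x) => "int";
    public static string Which(uint x) => "uint";
    public static string Which(long x) => "long";
    public static string Which(ulong x) => "ulong";
    public static string Which(float x) => "float";
    public static string Which(double x) => "double";
    public static string Which(decimal x) => "decimal";

    // Overload resolution has to pick the best conversion from char.
    public static string ForChar() => Which('/');

    public static void Main()
    {
        Console.WriteLine(ForChar()); // int
    }
}
```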

Rawling answered Oct 09 '22 12:10


How this occurs is an implicit conversion: "A char can be implicitly converted to ushort, int, uint, long, ulong, float, double, or decimal." (char, MSDN).

The simplest form of the reproduction is

int slash = +'/'; // 47

Char internally is a struct. "Purpose: This is the value class representing a Unicode character" (char.cs, Microsoft Reference Source). The implicit conversion itself is defined by the language, and the struct exposes the same conversion as a library call by implementing the IConvertible interface.

public struct Char : IComparable, IConvertible

Specifically, with this piece of code

/// <internalonly/>
int IConvertible.ToInt32(IFormatProvider provider) {
    return Convert.ToInt32(m_value);
}

The IConvertible interface states in a comment in code

// The IConvertible interface represents an object that contains a value. This
// interface is implemented by the following types in the System namespace:
// Boolean, Char, SByte, Byte, Int16, UInt16, Int32, UInt32, Int64, UInt64,
// Single, Double, Decimal, DateTime, TimeSpan, and String.

Looking back at the purpose of the struct (to be a value representing a Unicode character), it is clear that the intention was to provide a way for the value to be converted to supported types. IConvertible goes on to state

// The implementations of IConvertible provided by the System.XXX value classes
// simply forward to the appropriate Value.ToXXX(YYY) methods (a description of
// the Value class follows below). In cases where a Value.ToXXX(YYY) method
// does not exist (because the particular conversion is not supported), the
// IConvertible implementation should simply throw an InvalidCastException.

Which is explicitly stating that conversions which are not supported throw exceptions. It is also explicitly stated that converting a character to an integer will give the integer value of that character.

"The ToInt32(Char) method returns a 32-bit signed integer that represents the UTF-16 encoded code unit of the value argument." (Convert.ToInt32 Method (Char), MSDN)

All in all, the reasoning for the behavior seems to be self-evident. The integer value of the char has meaning as a "UTF-16 encoded code unit". The forward slash's value is 47.
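The conversions described above can be tried side by side. A minimal sketch, with made-up names (CharConversionDemo and its methods); it assumes only Convert.ToInt32(char) and the explicit IConvertible implementation on Char quoted earlier.

```csharp
using System;
using System.Globalization;

public static class CharConversionDemo
{
    // Implicit language conversion from char to int.
    public static int ViaImplicit() { int code = '/'; return code; }

    // Library conversion quoted from the documentation.
    public static int ViaConvert() => Convert.ToInt32('/');

    // The explicit interface implementation shown in char.cs.
    public static int ViaInterface() =>
        ((IConvertible)'/').ToInt32(CultureInfo.InvariantCulture);

    // An unsupported conversion (char to bool) throws InvalidCastException.
    public static bool BooleanConversionThrows()
    {
        try
        {
            ((IConvertible)'/').ToBoolean(CultureInfo.InvariantCulture);
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ViaImplicit());             // 47
        Console.WriteLine(ViaConvert());              // 47
        Console.WriteLine(ViaInterface());            // 47
        Console.WriteLine(BooleanConversionThrows()); // True
    }
}
```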

Because char is a built-in numeric type, the implicit conversion to int triggered by the plus sign is done at compile time. This can be seen by reusing the simple example above in a small program (LINQPad works for testing this)

void Main()
{
    int slash = +'/';
    Console.WriteLine(slash);
}

Becomes

IL_0000:  ldc.i4.s    2F 
IL_0002:  stloc.0     // slash2
IL_0003:  ldloc.0     // slash2
IL_0004:  call        System.Console.WriteLine
IL_0009:  ret    

Where the '/' is simply converted to the hexadecimal value 2F (47 in decimal) and then used from there.
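Another way to see that the value is fixed at compile time, sketched here with a hypothetical ConstantDemo class: +'/' is accepted as the initializer of a const, which C# only allows for constant expressions.

```csharp
using System;

public static class ConstantDemo
{
    // Compiles only because +'/' is a constant expression of type int.
    public const int Slash = +'/';

    // Formats the constant as uppercase hexadecimal.
    public static string SlashAsHex() => Slash.ToString("X");

    public static void Main()
    {
        Console.WriteLine(Slash);         // 47
        Console.WriteLine(Slash == 0x2F); // True
        Console.WriteLine(SlashAsHex());  // 2F
    }
}
```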

Travis J answered Oct 09 '22 12:10


+ '/'

gives you 47, the UTF-16 character code (in decimal) of the character "/", and @Guffa has already explained why.

a45b answered Oct 09 '22 13:10


In C#, a char literal is written in single quotes ('/' in your case). The + in front of the char acts as a unary operator and makes the compiler produce the UTF-16 code of the char '/', which is 47.

Krishna answered Oct 09 '22 11:10