
String assignment in C#

A few weeks ago, I discovered that strings in C# are defined as reference types and not value types. Initially I was confused about this, but then after some reading, I suddenly understood why it is important to store strings on the heap and not the stack - because it would be very inefficient to have a very large string that gets copied over an unpredictable number of stack frames. I completely accept this.

I feel that my understanding is almost complete, but there is one element that I am missing - what language feature do strings use to keep them immutable? To illustrate with a code example:

string valueA = "FirstValue";
string valueB = valueA;
valueA = "AnotherValue";

Assert.AreEqual("FirstValue", valueB); // Passes

I do not understand what language feature makes a copy of valueA when I assign it to valueB. Or perhaps, the reference to valueA does not change when I assign it to valueB, only valueA gets a new reference to itself when I set the string. As this is an instance type, I do not understand why this works.

I understand that you can overload, for example, the == and != operators, but I cannot seem to find any documentation on overloading the = operator. What is the explanation?

Steve Rukuts asked Jun 04 '11 11:06



3 Answers

what language feature do strings use to keep them immutable?

It is not a language feature. It is the way the class is defined.

For example,

class Integer {
    private readonly int value;

    public int Value { get { return this.value; } }
    public Integer(int value) { this.value = value; }
    public Integer Add(Integer other) {
        return new Integer(this.value + other.value);
    }
}

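is like an int, except that it is a reference type. It is immutable because we defined it to be so: the field is readonly and Add returns a new instance instead of changing the existing one. We can define it to be mutable too: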

class MutableInteger {
    private int value;

    public int Value { get { return this.value; } }
    public MutableInteger(int value) { this.value = value; }
    public MutableInteger Add(MutableInteger other) {
        this.value = this.value + other.value;
        return this;
    } 
}

See?
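To make the difference concrete, here is a minimal sketch of what a second variable observes in each case. The two classes are repeated so the snippet compiles on its own; the AliasDemo class and its method names are hypothetical, added only for illustration.

```csharp
using System;

// Copies of the two classes above, so this snippet is self-contained.
class Integer {
    private readonly int value;
    public int Value { get { return this.value; } }
    public Integer(int value) { this.value = value; }
    public Integer Add(Integer other) {
        return new Integer(this.value + other.value);
    }
}

class MutableInteger {
    private int value;
    public int Value { get { return this.value; } }
    public MutableInteger(int value) { this.value = value; }
    public MutableInteger Add(MutableInteger other) {
        this.value = this.value + other.value;
        return this;
    }
}

// Hypothetical demo class, not part of the original answer.
static class AliasDemo {
    public static int ImmutableAlias() {
        Integer a = new Integer(1);
        Integer b = a;                // b and a refer to the same instance
        a = a.Add(new Integer(2));    // Add builds a new instance; a now refers to it
        return b.Value;               // b still refers to the untouched original: 1
    }

    public static int MutableAlias() {
        MutableInteger a = new MutableInteger(1);
        MutableInteger b = a;         // b and a refer to the same instance
        a.Add(new MutableInteger(2)); // Add changes that shared instance in place
        return b.Value;               // the change is visible through b: 3
    }

    static void Main() {
        Console.WriteLine(ImmutableAlias());
        Console.WriteLine(MutableAlias());
    }
}
```

string behaves like Integer here: none of its public methods change the instance they are called on.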

I do not understand what language feature makes a copy of valueA when I assign it to valueB.

It doesn't copy the string, it copies the reference. string is a reference type. This means that variables of type string are storage locations whose values are references. In this case, their values are references to instances of string. When you assign a variable of type string to another variable of type string, the value is copied. In this case, the value is a reference, and it is the reference that is copied by the assignment. This is true for any reference type, not just string and not just immutable reference types.

Or perhaps, the reference to valueA does not change when I assign it to valueB, only valueA gets a new reference to itself when I set the string.

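Not quite. After the assignment valueB = valueA, the values of valueA and valueB refer to the same instance of string. Their values are references, and those values are equal. If you could somehow mutate* the instance of string referred to by valueA, the mutation would be visible through both valueA and valueB. The later assignment valueA = "AnotherValue" does not touch that instance; it only makes valueA refer to a different one.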
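Since string cannot be mutated through its public API, a mutable reference type is needed to see this effect. The sketch below uses System.Text.StringBuilder as a stand-in; the SharedReferenceDemo class and its method names are hypothetical.

```csharp
using System;
using System.Text;

// Hypothetical demo class: StringBuilder stands in for "a string you could mutate".
static class SharedReferenceDemo {
    public static string MutateThroughOneVariable() {
        StringBuilder a = new StringBuilder("First");
        StringBuilder b = a;   // copies the reference, not the object
        a.Append("Value");     // mutates the single shared instance
        return b.ToString();   // the mutation is visible through b
    }

    public static string ReassignOneVariable() {
        StringBuilder a = new StringBuilder("First");
        StringBuilder b = a;
        a = new StringBuilder("Another"); // a now refers to a different instance
        return b.ToString();              // b still refers to the first instance
    }

    static void Main() {
        Console.WriteLine(MutateThroughOneVariable()); // FirstValue
        Console.WriteLine(ReassignOneVariable());      // First
    }
}
```

The second method is the string case from the question: reassigning one variable never affects what the other variable refers to.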

As this is an instance type, I do not understand why this works.

There is no such thing as an instance type.

Basically, strings are reference types. But strings are immutable. When you call a method that appears to mutate a string, what actually happens is that you get back a reference to a new string that holds the result, and the existing string is left unchanged.

string s = "hello, world!";
string t = s;
string u = s.ToUpper();

Here, s and t are variables whose values refer to the same instance of string. The referent of s is not mutated by the call to String.ToUpper. Instead, s.ToUpper creates a new instance of string that contains the uppercased characters and returns a reference to it. We assign that reference to u.
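These claims can be checked with object.ReferenceEquals, which compares references rather than string contents. A minimal sketch follows; the ToUpperDemo class and its method names are hypothetical.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
static class ToUpperDemo {
    // True when s and t hold the same reference after t = s.
    public static bool AssignmentSharesInstance() {
        string s = "hello, world!";
        string t = s;
        return object.ReferenceEquals(s, t);
    }

    // Returns what s looks like after ToUpper has been called on it.
    public static string OriginalAfterToUpper() {
        string s = "hello, world!";
        string u = s.ToUpper(); // u refers to a new string; s is not modified
        return s;
    }

    // Returns the string that ToUpper produced.
    public static string UpperCased() {
        string s = "hello, world!";
        return s.ToUpper();
    }

    static void Main() {
        Console.WriteLine(AssignmentSharesInstance()); // True
        Console.WriteLine(OriginalAfterToUpper());     // hello, world!
        Console.WriteLine(UpperCased());               // HELLO, WORLD!
    }
}
```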

I understand that you can overload, for example, the == and != operators, but I cannot seem to find any documentation on overloading the = operator.

You can't overload =.

* You can mutate a string with some tricks (unsafe code, for example). Ignore them.

jason answered Oct 20 '22 04:10


First of all, your example works the same way for variables of any reference type, not just strings.

What happens is:

string valueA = "FirstValue"; // valueA refers to "FirstValue"
string valueB = valueA; // valueB refers to the same string that valueA refers to: "FirstValue"
valueA = "AnotherValue"; // valueA now refers to a different string: "AnotherValue"
Assert.AreEqual("FirstValue", valueB); // valueB still refers to "FirstValue"

Immutability is a different concept. It means that the value itself can't be changed.
It shows up in a situation like this:

string valueA = "FirstValue"; // valueA refers to "FirstValue"
string valueB = valueA; // valueB refers to the same string that valueA refers to: "FirstValue"
valueA = valueA.Replace('F', 'B'); // Replace returns a new string; valueA now refers to "BirstValue"
Assert.AreEqual("FirstValue", valueB); // valueB still refers to the original, unchanged "FirstValue"

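This is because of string's immutability: Replace doesn't change the string itself. It creates a new copy with the changes and returns a reference to it. valueA refers to that copy only if the result is assigned back to valueA; otherwise the copy is simply discarded.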
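A common mistake is to call Replace and ignore its return value. The sketch below contrasts the two cases; the ReplaceDemo class and its method names are hypothetical.

```csharp
using System;

// Hypothetical demo class, not part of the original answer.
static class ReplaceDemo {
    // The result of Replace is thrown away, so valueA is unaffected.
    public static string DiscardedResult() {
        string valueA = "FirstValue";
        valueA.Replace('F', 'B'); // builds "BirstValue" but nothing keeps a reference to it
        return valueA;            // still "FirstValue"
    }

    // The result is assigned back, so valueA refers to the new string.
    // Returns { valueA, valueB }.
    public static string[] AssignedResult() {
        string valueA = "FirstValue";
        string valueB = valueA;
        valueA = valueA.Replace('F', 'B');
        return new[] { valueA, valueB };
    }

    static void Main() {
        Console.WriteLine(DiscardedResult());                  // FirstValue
        Console.WriteLine(string.Join(", ", AssignedResult())); // BirstValue, FirstValue
    }
}
```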

Yochai Timmer answered Oct 20 '22 04:10


Or perhaps, the reference to valueA does not change when I assign it to valueB, only valueA gets a new reference to itself when I set the string.

That is correct. As strings are immutable, there is no problem having two variables referencing the same string object. When you assign a new string to one of them, it's the reference that is replaced, not the string object.

I cannot seem to find any documentation on overloading the = operator.

That is not due to any shortcoming on your side. There is no way to overload the assignment operator in C#.

The = operator is quite simple: it takes the value on the right-hand side and assigns it to the variable on the left-hand side. If the variable is of a reference type, the value is the reference, so that is what's assigned.
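The same rule, "copy the value", gives different observable results for value types and reference types. A minimal sketch, using two hypothetical types (PointValue and PointReference) that are not from the original answer:

```csharp
using System;

// Hypothetical types for illustration only.
struct PointValue { public int X; }     // value type: the variable holds the data itself
class PointReference { public int X; }  // reference type: the variable holds a reference

static class AssignmentDemo {
    public static int CopyStruct() {
        PointValue a = new PointValue { X = 1 };
        PointValue b = a; // = copies the whole struct
        a.X = 2;          // changes a's own copy only
        return b.X;       // 1
    }

    public static int CopyReference() {
        PointReference a = new PointReference { X = 1 };
        PointReference b = a; // = copies the reference; both refer to one object
        a.X = 2;              // changes the shared object
        return b.X;           // 2
    }

    static void Main() {
        Console.WriteLine(CopyStruct());
        Console.WriteLine(CopyReference());
    }
}
```

With string, the assignment behaves like the PointReference case, but because the string object can never change, sharing it is harmless.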

Guffa answered Oct 20 '22 03:10