Consider the following code:
unsafe
{
    string foo = string.Copy("This can't change");
    fixed (char* ptr = foo)
    {
        char* pFoo = ptr;
        pFoo[8] = pFoo[9] = ' '; // overwrite the apostrophe and the 't'
    }
    Console.WriteLine(foo); // "This can   change"
}
This creates a pointer to the first character of foo, copies it into an ordinary local pointer that can be written through, and overwrites the characters at indices 8 and 9 with ' '.
Notice I never actually reassigned foo; instead, I changed its value by modifying its state, that is, by mutating the string. Therefore, .NET strings are mutable.
This works so well, in fact, that the following code:
unsafe
{
    string bar = "Watch this";
    fixed (char* p = bar)
    {
        char* pBar = p;
        pBar[0] = 'C';
    }
    string baz = "Watch this";
    Console.WriteLine(baz); // Unrelated, right?
}
will print "Catch this" due to string literal interning.
This has plenty of practical uses. For example, this:
string GetForInputData(byte[] inputData)
{
    // allocate a mutable buffer...
    char[] buffer = new char[inputData.Length];
    // fill the buffer with input data
    // ...and a string to return
    return new string(buffer);
}
gets replaced by:
unsafe string GetForInputData(byte[] inputData)
{
    // allocate a string to return
    string result = new string('\0', inputData.Length);
    fixed (char* ptr = result)
    {
        // fill the result with input data
    }
    return result; // return it
}
This avoids allocating a temporary buffer and copying it into the string, which could save significant memory and time if you work in a speed-critical field (e.g. encodings).
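For comparison, the same single-allocation fill can be sketched without pointers using string.Create, assuming a runtime that provides it (.NET Core 2.1 or later). The class name StringFill and the byte-to-char widening are placeholders for illustration only; the original does not say how the input data is decoded.

```csharp
using System;

static class StringFill
{
    // Hypothetical stand-in for GetForInputData: widens each input byte to one char.
    public static string GetForInputData(byte[] inputData)
    {
        // string.Create allocates the string once and hands the callback a
        // writable Span<char> over its characters before the string is returned.
        return string.Create(inputData.Length, inputData, (span, bytes) =>
        {
            for (int i = 0; i < span.Length; i++)
            {
                span[i] = (char)bytes[i]; // assumption: one byte maps to one char
            }
        });
    }
}
```

The string is only written to inside the callback, before any other code can hold a reference to it, so no unsafe context is needed and no already-visible string is modified.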
I guess you could say that this doesn't count because it "uses a hack" to make pointers mutable, but then again it was the C# language designers who supported assigning a string to a pointer in the first place. (In fact, this is done all the time internally in String and StringBuilder, so technically you could make your own StringBuilder with this.)
So, should .NET strings really be considered immutable?
§ 18.6 of the C# language specification (The fixed statement) specifically addresses the case of modifying a string through a fixed pointer, and indicates that doing so can result in undefined behavior:
Modifying objects of managed type through fixed pointers can result in undefined behavior. For example, because strings are immutable, it is the programmer’s responsibility to ensure that the characters referenced by a pointer to a fixed string are not modified.
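A minimal sketch of an edit that stays within that rule: copy the characters into a buffer you own, change the copy, and build a new string from it. The names SafeEdit and BlankOut are hypothetical and chosen only for this illustration.

```csharp
using System;

static class SafeEdit
{
    // Returns a new string in which the characters at the given indices
    // are replaced by spaces. The source string is never written to.
    public static string BlankOut(string source, params int[] indices)
    {
        char[] buffer = source.ToCharArray(); // private, mutable copy
        foreach (int i in indices)
        {
            buffer[i] = ' ';
        }
        return new string(buffer);
    }
}
```

Calling SafeEdit.BlankOut("This can't change", 8, 9) returns "This can   change" (the two new spaces plus the one already there), while the literal passed in keeps its original contents.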
I just had to experiment with this to confirm whether identical string literals point to the same memory location.
The results are:
string foo = "Fix value?";          // New address: 0x02b215f8
string foo2 = "Fix value?";         // Points to the same address: 0x02b215f8
string fooCopy = string.Copy(foo);  // New address: 0x021b2888
fixed (char* p = foo)
{
    p[9] = '!';
}
Console.WriteLine(foo);
Console.WriteLine(foo2);
Console.WriteLine(fooCopy);
// References are equal, which means both refer to the same memory address
Console.WriteLine(string.ReferenceEquals(foo, foo2)); // true
// References are not equal, because the copy is a separate string at a new memory address
Console.WriteLine(string.ReferenceEquals(foo, fooCopy)); // false
We see that foo is initialized with a string literal located at memory address 0x02b215f8 on my PC. Assigning the same string literal to foo2 references the same memory address, while creating a copy of that string literal allocates a new string. Further testing via string.ReferenceEquals() confirms that foo and foo2 are indeed the same reference, while foo and fooCopy are different references.
It is interesting to see how a string literal can be manipulated in memory and how that affects other variables that merely reference it. Since this behavior exists, it is something we should be careful about.