Why does C# use the same memory address in this case?

Tags:

c#

.net-core

When I execute this code:

using System;

class Program
{
    static void Main(string[] args)
    {
        // scope 1
        {
            string x = "shark";
            string y = x.Substring(0);
            unsafe
            {
                fixed (char* c = y)
                {
                    c[4] = 'p';
                }
            }
            Console.WriteLine(x);
        }
        // scope 2
        {
            string x = "shark";
            // Why does this line print "sharp" and not "shark"?
            Console.WriteLine(x);
        }
    }
}

the output is:

sharp
sharp

When I separate these two scopes into methods like this:

class Program
    {
        static void Main(string[] args)
        {
            func1();
            func2();
        }
        private static void func2()
        {
            {
                string x = "shark";
                Console.WriteLine(x);
            }
        }

        private static void func1()
        {
            {
                string x = "shark";
                string y = x.Substring(0);
                unsafe
                {
                    fixed (char* c = y)
                    {
                        c[4] = 'p';
                    }
                }
                Console.WriteLine(x);
            }
        }
    }

the output is:

sharp
shark

Edit

I also tried it this way:

class Program
{
    static void Main(string[] args)
    {
        {
            string x = "shark";
            string y = x.Substring(0);
            unsafe
            {
                fixed (char* c = y)
                {
                    c[4] = 'p';
                }
            }
            Console.WriteLine(x);
        }
        void Test()
        {
            {
                string x = "shark";
                Console.WriteLine(x);
            }
        }
        Test();
    }
}

and the output is:

 sharp
 shark

**The environment I used is macOS and .NET Core 2.2 (Rider).**

I expected the same output in all cases, but the output differs. As far as I know, interning means that all hardcoded strings are stored in the assembly and reused globally across the whole application, so they share the same memory. But with this code, hardcoded strings appear to be reused only within a function's scope and not globally.

Is this a .NET Core bug, or is there an explanation?


hovjan asked Aug 22 '19 06:08


2 Answers

Note: the question has changed since I wrote this.

If you look at the source of Substring:

if( startIndex == 0 && length == this.Length) {
   return this;
}

So when you call Substring(0), you get a reference to the original string, and you then mutate that same string with unsafe code.

In the second example Substring(1) is allocating a string.
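The identity claim can be checked directly without any unsafe code. The following is a minimal sketch; the class and method names are made up for illustration, and it assumes a runtime whose Substring has the early return quoted above.

```csharp
using System;

static class SubstringIdentityDemo
{
    // True when Substring(0) hands back the very same object instead of a copy.
    public static bool SubstringZeroIsSameInstance(string s) =>
        object.ReferenceEquals(s, s.Substring(0));

    // True when Substring(1) hands back the same object. A shorter string
    // has to be allocated, so this is expected to be false.
    public static bool SubstringOneIsSameInstance(string s) =>
        object.ReferenceEquals(s, s.Substring(1));

    static void Main()
    {
        string x = "shark";
        Console.WriteLine(SubstringZeroIsSameInstance(x)); // True
        Console.WriteLine(SubstringOneIsSameInstance(x));  // False
    }
}
```

Because x and x.Substring(0) are one object, writing through a pointer to y also rewrites x.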


A more in-depth analysis:

string x = "shark";
string y = x.Substring(0);
// x reference and y reference are pointing to the same place

// then you mutate the one memory
c[4] = 'p';

// second scope
string x = "shark";
string y = x.Substring(1);
// x reference and y reference are different

// you are mutating y
c[0] = 'p';

Edit

The string is interned, and the compiler treats every "shark" literal as the same string (via a hash). This is why the second part produced the mutated result even with different variables.

String interning refers to keeping a single copy of each unique string in a string intern pool, which is implemented as a hash table in the .NET common language runtime (CLR). The key is a hash of the string and the value is a reference to the actual String object.
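A small sketch of what interning looks like from code, using only reference comparisons. The class and method names are hypothetical. The last line in Main prints whatever the runtime reports, since whether a given literal is found in the pool can depend on how the assembly was compiled and loaded.

```csharp
using System;

static class InternDemo
{
    // Two identical literals in the same method resolve to one object.
    public static bool LiteralsShareInstance()
    {
        string a = "shark";
        string b = "shark";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time has equal content but is a separate object.
    public static bool RuntimeStringSharesInstance()
    {
        string a = "shark";
        string c = new string(new[] { 's', 'h', 'a', 'r', 'k' });
        return a == c && object.ReferenceEquals(a, c);
    }

    static void Main()
    {
        Console.WriteLine(LiteralsShareInstance());       // True
        Console.WriteLine(RuntimeStringSharesInstance()); // False

        // string.Intern returns the pool's reference for this content,
        // adding the string to the pool if no equal entry exists yet.
        string built = new string(new[] { 's', 'h', 'a', 'r', 'k' });
        Console.WriteLine(object.ReferenceEquals("shark", string.Intern(built)));
    }
}
```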

Debugging the second part (with or without scope, and with different variables):

(screenshot not preserved)

Edit 2

Scope doesn't matter for me on either .NET Framework or .NET Core; it always produces the same result (the first). It could well be an implementation detail, given the loosely defined nature of interning in the specs.

TheGeneral answered Oct 11 '22 17:10


As has already been pointed out, this is happening because you are changing the interned string itself, which will change the string for everything that uses that interned string.
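If the goal is only to get a modified string, the interned literal can be left alone by editing a private copy. This is a minimal sketch; ReplaceCharAt is a hypothetical helper, not a framework method.

```csharp
using System;

static class SafeEditDemo
{
    // Returns a new string with one character replaced.
    // The source string is never written to.
    public static string ReplaceCharAt(string source, int index, char replacement)
    {
        char[] buffer = source.ToCharArray(); // private copy of the characters
        buffer[index] = replacement;
        return new string(buffer);
    }

    static void Main()
    {
        string x = "shark";
        string y = ReplaceCharAt(x, 4, 'p');

        Console.WriteLine(y);       // sharp
        Console.WriteLine(x);       // shark
        Console.WriteLine("shark"); // shark
    }
}
```

Since only the copy is modified, every other use of the "shark" literal in the program keeps its original content.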

It's interesting to note that you do see this changing if you separate out the two methods like so:

using System;

namespace CoreApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            string x = "shark";
            Console.WriteLine("Main: " + x);

            func2(); // If you comment this out, then the  below call to func2() outputs "shark" instead of "sharp"
            func1();
            Console.WriteLine("Main: " + x);

            func2();
        }

        static void func1()
        {
            string x = "shark";
            string y = x.Substring(0);

            unsafe
            {
                fixed (char* c = y)
                {
                    c[4] = 'p';
                }
            }

            Console.WriteLine("func1(): " + x);
        }

        static void func2()
        {
            string x = "shark";
            Console.WriteLine("func2(): " + x);
        }
    }
}

The output of the code above is:

Main: shark
func2(): shark
func1(): sharp
Main: sharp
func2(): sharp

Interestingly, if you comment out the first call to func2(), the output is:

Main: shark
func1(): sharp
Main: sharp
func2(): shark

The reason for the difference is a little harder to explain. I think one would have to look at the actual IL generated to see if anything is being cached.

Note that you can change an interned string without using unsafe code, like so:

using System;
using System.Runtime.InteropServices;

namespace CoreApp1
{
    class Program
    {
        static void Main(string[] args)
        {
            const string test  = "ABCDEF"; // Strings are immutable, right?
            char[]       chars = new StringToChar { str = test }.chr;
            chars[0] = 'X';

            // On an x32 release or debug build or on an x64 debug build, 
            // the following prints "XBCDEF".
            // On an x64 release build, it prints "ABXDEF".
            // In both cases, we have changed the contents of 'test' without using
            // any 'unsafe' code...

            Console.WriteLine(test);

            // The following line is even more disturbing, since the constant
            // string "ABCDEF" has been mutated too (because the interned 'constant' string was mutated).

            Console.WriteLine("ABCDEF");
        }
    }

    [StructLayout(LayoutKind.Explicit)]
    public struct StringToChar
    {
        [FieldOffset(0)] public string str;
        [FieldOffset(0)] public char[] chr;
    }
}

This is, of course, a little surprising, but it's not a bug.

Matthew Watson answered Oct 11 '22 15:10