 

How do strings look from the compiler's point of view?

In C, the compiler has a pointer to the start of the string, and the string ends with a terminator symbol ('\0'). If a user wants the length of the string, the program has to count the elements of the character array until it finds '\0'.

In UCSD strings, the length of the string is stored at the start of the string, before the characters.
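To make the two layouts above concrete, here is a minimal sketch that models both in plain byte arrays. The class and method names (StringLayouts, CStyleLength, PrefixedLength) are made up for illustration, and the one-byte length prefix is an assumption about the prefixed format.

```csharp
using System;

static class StringLayouts
{
    // C-style: walk the buffer until the terminating 0 byte.
    // The cost grows with the length of the string.
    public static int CStyleLength(byte[] buffer)
    {
        int n = 0;
        while (buffer[n] != 0)
        {
            n++;
        }
        return n;
    }

    // Length-prefixed style (assumed here to use a single length byte):
    // the length is read directly, without scanning the characters.
    public static int PrefixedLength(byte[] buffer)
    {
        return buffer[0];
    }

    static void Main()
    {
        byte[] cString = { (byte)'a', (byte)'b', (byte)'c', 0 };
        byte[] prefixed = { 3, (byte)'a', (byte)'b', (byte)'c' };

        Console.WriteLine(CStyleLength(cString));   // prints 3
        Console.WriteLine(PrefixedLength(prefixed)); // prints 3
    }
}
```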

And what does the compiler think about C# strings? Yes, from the user's point of view String is an object that has a Length property, but I'm not talking about the high-level view. I want to know the low-level details; e.g., how does the compiler calculate the length of the string?

homk asked Oct 04 '15 18:10



2 Answers

Let's execute the following code:

string s = "123";
string s2 = "234";
string s3 = s + s2;
string s4 = s2 + s3;
Console.WriteLine(s + s2);

Now let's put a breakpoint at the last line and open the memory window:

[Screenshot: memory window showing the strings]

Entering s3 in the memory window's address field, we can see the two strings (s3 and s4) allocated one after the other, each with a 4-byte length field at the beginning.

You can also see that other memory is allocated alongside them, such as the String class's type token and other String class data.
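The length field seen in the memory window can also be read from code. This is only a sketch: it assumes the layout described above (a 4-byte length stored immediately before the first character), which is an implementation detail rather than a documented guarantee, and it has to be compiled with unsafe code enabled. The names StringHeaderPeek and ReadLengthField are hypothetical.

```csharp
using System;

static class StringHeaderPeek
{
    // ASSUMPTION: the int length field sits directly before the first char.
    // This mirrors what the memory window shows, but it is not a public contract.
    public static unsafe int ReadLengthField(string s)
    {
        fixed (char* firstChar = s)
        {
            // Step back one int (4 bytes) from the first character.
            return *((int*)firstChar - 1);
        }
    }

    static void Main()
    {
        string s = "123";
        string s2 = "234";
        string s3 = s + s2;

        Console.WriteLine(ReadLengthField(s3)); // raw field read from memory
        Console.WriteLine(s3.Length);           // the supported way to get it
    }
}
```

On a runtime where the layout assumption holds, both lines print the same value. Real code should always use Length.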

The String class itself contains a member, private int m_stringLength;, which holds the length of the string. This also makes string.Concat() very fast, because it can allocate the full result length up front:

int totalLength = str0.Length + str1.Length + str2.Length;

String result = FastAllocateString(totalLength);
FillStringChecked(result, 0, str0);
FillStringChecked(result, str0.Length, str1);
FillStringChecked(result, str0.Length + str1.Length, str2);
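FastAllocateString and FillStringChecked are internal to the runtime, so the excerpt above cannot be called from user code. The same idea (compute the total length first, allocate once, then copy each part into place) can be sketched with the public string.Create API, assuming .NET Core 2.1 or later. Concat3 is a hypothetical helper, and unlike the real Concat it does not treat null arguments as empty strings.

```csharp
using System;

static class ConcatSketch
{
    public static string Concat3(string a, string b, string c)
    {
        // Each Length read is a field access, so the total is known up front.
        int totalLength = a.Length + b.Length + c.Length;

        // Allocate the result once, then fill it in place.
        return string.Create(totalLength, (a, b, c), (dest, parts) =>
        {
            parts.a.AsSpan().CopyTo(dest);
            parts.b.AsSpan().CopyTo(dest.Slice(parts.a.Length));
            parts.c.AsSpan().CopyTo(dest.Slice(parts.a.Length + parts.b.Length));
        });
    }

    static void Main()
    {
        Console.WriteLine(Concat3("123", "234", "!")); // prints 123234!
    }
}
```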

What I find a little strange is that the LINQ Count() extension method, when called on a string (as an IEnumerable<char>), falls back to the default implementation, which iterates over the characters one by one. For types that implement ICollection<T>, such as List<T>, Count() instead returns the ICollection<T>.Count property directly. String does not implement ICollection<char>, so it does not get that shortcut.
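A small check of the point about Count(): the sketch below tests whether a sequence implements ICollection<char>, which is the interface the LINQ shortcut looks for. HasCollectionCount is a hypothetical helper name used only here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CountVersusLength
{
    // True when the object exposes ICollection<char>.Count,
    // which lets LINQ's Count() skip the element-by-element walk.
    public static bool HasCollectionCount(object sequence)
    {
        return sequence is ICollection<char>;
    }

    static void Main()
    {
        string s = "123234";
        List<char> list = new List<char>(s);

        Console.WriteLine(HasCollectionCount(s));    // prints False
        Console.WriteLine(HasCollectionCount(list)); // prints True
        Console.WriteLine(s.Count() == s.Length);    // prints True
    }
}
```

Both calls return the same number for a string; the difference is only in how much work is done, so Length is the better choice when the value is already a string.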

Tamir Vered answered Nov 08 '22 16:11



In C#, the length of the string is stored in the object in a private field ([NonSerialized] private int m_stringLength;), so it doesn't have to be calculated at run time.

The source code of the String class is available online.

Jakub Lortz answered Nov 08 '22 16:11
