
Reading a String as an int pointer

Tags: c#

Alright, so this all started with my interest in hash codes. After doing some reading from a Jon Skeet post I asked this question, and that got me really interested in pointer arithmetic, something I have almost no experience with. So, after reading through this page, I began experimenting with the rudimentary understanding I picked up there and from my other fantastic peers here on SO!

Now I'm doing some more experimenting, and I believe the code below accurately duplicates the hash code loop from the string implementation (I reserve the right to be wrong about that):

Console.WriteLine("Iterating STRING (2) as INT ({0})", sizeof(int));
Console.WriteLine();

var val = "Hello World!";
unsafe
{
    fixed (char* src = val)
    {
        var ptr = (int*)src; // reinterpret the UTF-16 character data as 32-bit ints
        var len = val.Length;
        while (len > 2)
        {
            Console.WriteLine((char)*ptr);
            Console.WriteLine((char)ptr[1]);

            ptr += 2;        // move forward two ints = 8 bytes = 4 chars
            len -= sizeof(int);
        }

        if (len > 0)
        {
            Console.WriteLine((char)*ptr);
        }
    }
}

But the results are a bit perplexing to me; kind of. Here they are:

Iterating STRING (2) as INT (4)

H
l
o
W
r
d

I originally thought the value at ptr[1] would be the second letter, read (or squished together) with the first. However, it clearly isn't. Is that because ptr[1] is technically at byte 4 on the first iteration and byte 12 on the second?
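
A quick way to check that offset arithmetic (this sketch isn't from the original post, it just reuses the same loop) is to print how far ptr[0] and ptr[1] are from the start of the string data on each pass:

var val = "Hello World!";
unsafe
{
    fixed (char* src = val)
    {
        var ptr = (int*)src;
        var len = val.Length;
        while (len > 2)
        {
            // Pointer subtraction on byte* gives the distance in bytes.
            long offset0 = (byte*)ptr - (byte*)src;
            long offset1 = (byte*)(ptr + 1) - (byte*)src;
            Console.WriteLine("ptr[0] at byte {0}, ptr[1] at byte {1}", offset0, offset1);

            ptr += 2;
            len -= sizeof(int);
        }
    }
}

That prints 0/4, then 8/12, then 16/20, which lines up with the guess above.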

asked Nov 30 '22 by Mike Perrenoud


1 Answer

Your problem is that you're casting the pointer to an int* pointer, whose elements are 32 bits wide, not 16 bits like a char*'s.

Therefore, each increment is 32 bits. Here's a picture (praise my artwork if you must):

[Diagram: a char* stepping through the string one 16-bit character at a time, next to an int* stepping two characters (32 bits) at a time]

When you're reading via a char pointer, you're reading character by character, 16 bits at a time.

When you cast it to an int pointer, you're reading in 32-bit increments. That means ptr[0] covers both H and e (but points at the base of the H), and ptr[1] covers both l's.

That is why you are essentially skipping a character in your output.
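
To make that mapping concrete, here's a minimal sketch (assuming the same "Hello World!" string) that prints which pair of characters each int index covers:

var val = "Hello World!";
unsafe
{
    fixed (char* src = val)
    {
        var ptr = (int*)src;
        // Two UTF-16 chars fit in each 32-bit int, so Length / 2 ints cover the whole string.
        for (int i = 0; i < val.Length / 2; i++)
        {
            char first = *(char*)(ptr + i);        // first char of the pair
            char second = *((char*)(ptr + i) + 1); // second char of the pair
            Console.WriteLine("ptr[{0}] -> '{1}' and '{2}'", i, first, second);
        }
    }
}

That prints ptr[0] -> 'H' and 'e', ptr[1] -> 'l' and 'l', and so on.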

When you cast it back to a char here:

Console.WriteLine((char)*ptr);

...only the low 16 bits survive that conversion, which (on a little-endian machine) is the first character of each pair.
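
If you want both characters out of each int instead of just the low half, one option (a sketch, assuming a little-endian platform so the low 16 bits hold the first character of the pair) is to mask and shift:

var val = "Hello World!";
unsafe
{
    fixed (char* src = val)
    {
        var ptr = (int*)src;
        var len = val.Length;
        while (len > 2)
        {
            // Each int holds two UTF-16 code units: one in the low 16 bits, one in the high 16 bits.
            int pair = *ptr;
            Console.Write((char)(pair & 0xFFFF));
            Console.Write((char)((pair >> 16) & 0xFFFF));

            pair = ptr[1];
            Console.Write((char)(pair & 0xFFFF));
            Console.Write((char)((pair >> 16) & 0xFFFF));

            ptr += 2;
            len -= sizeof(int);
        }

        if (len > 0)
        {
            Console.Write((char)*ptr); // odd trailing character, if any
        }

        Console.WriteLine();
    }
}

This time every character comes out, so it prints Hello World! again.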

answered Dec 06 '22 by Simon Whitehead