
C# pointers vs. C++ pointers

I have been learning to program, and I chose C++ and C# as my first languages. More specifically, I have an old C book someone was kind enough to lend me, and I'm using it to learn C#. I use Visual Studio Express and write in both C++ and C#. One area that interests me is direct memory management, which I am trying to use to optimize my code. However, I am struggling to do it properly and to see any real performance improvement. For example, here is my code in C#:

unsafe static void Main(string[] args)
{
    int size = 300000;
    char[] numbers = new char[size];

    for (int i = 0; i < size; i++)
    {
        numbers[i] = '7';
    }

    DateTime start = DateTime.Now;

    fixed (char* c = &numbers[0])
    {
        for (int i = 0; i < 10000000; i++)
        {
            int number = myFunction(c, 100000);
        }
    }

    /*char[] c = numbers;  // commented out C# non-pointer version same 
          speed as C# pointer version
    {
        for (int i = 0; i < 10000000; i++)
        {
            int number = myFunction(c, 100000);
        }
    }*/

    TimeSpan timeSpan = DateTime.Now - start;
    Console.WriteLine(timeSpan.TotalMilliseconds.ToString());
    Console.ReadLine();
}

static int myFunction(ref char[] numbers, int size)
{
    return size * 100;
}

static int myFunction(char[] numbers, int size)
{
    return size * 100;
}

unsafe static int myFunction(char* numbers, int size)
{
    return size * 100;
}

No matter which of the three methods I call, I get the same execution speed. I'm also still trying to wrap my head around the difference between using ref and using a pointer, but that's probably something that will take time and practice.

What I don't understand, however, is that I am able to produce a very significant performance difference in C++. Here is what I came up with when I attempted to approximate the same code in C++:

/*int myFunction(std::string* numbers, int size)  // C++ pointer version commented 
     out is much faster than C++ non-pointer version
{
    return size * 100;
}*/

int myFunction(std::string numbers, int size) // value version
{
    return size * 100;
}

int _tmain(int argc, _TCHAR* argv[])
{
    int size = 100000;
    std::string numbers = "";
    for (int i = 0; i < size; i++)
    {
        numbers += "777";
    }

    clock_t start = clock();

    for (int i = 0; i < 10000; i++)
    {
        int number = myFunction(numbers, 100000);
    }

    clock_t timeSpan = clock() - start;

    std::cout << timeSpan;
    char c;
    std::cin >> c;

    return 0;
}

Can anyone tell me why my C# code isn't benefiting from my use of references or pointers? I've been reading about it online, but I'm stuck.

ostrichchasedwormaroundfield asked Jan 11 '23 15:01


2 Answers

C# already generates pointers without you explicitly declaring them. Every reference to a reference type, like your numbers variable, is in fact a pointer at runtime. Every argument you pass with the ref or out keyword is in fact a pointer at runtime as well. The exact C equivalent of your ref char[] argument is char**, or char*& in C++. There's no difference in C#.

So you don't see any difference in speed because the code that actually executes is the same.
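To make the pointer analogy concrete, here is a minimal C++ sketch. The function names fillThroughPointer and retarget are made up for illustration and do not come from the question; the mapping to C# is only approximate, since a managed reference also carries type and length information.

```cpp
#include <cstddef>

// Roughly what passing a C# char[] looks like at runtime: only the
// address is copied, so the callee writes into the caller's elements.
void fillThroughPointer(char* numbers, std::size_t size, char value) {
    for (std::size_t i = 0; i < size; ++i) {
        numbers[i] = value;
    }
}

// Rough analogue of a C# `ref char[]` parameter: a reference to the
// pointer itself, so the callee can make the caller's variable
// point at a different array.
void retarget(char*& numbers, char* replacement) {
    numbers = replacement;
}
```

In both cases the array elements are never copied, which is why the C# variants in the question run at the same speed.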

It doesn't stop there, either: you never actually do anything with the array. The method you call disappears at runtime because, much as in a C or C++ compiler, the optimizer inlines it. And since you don't use the array argument, no code is generated for it either.
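One way to keep a benchmark from measuring nothing is to make the function read the array and to use its result. The following C++ sketch assumes hypothetical helpers countSevens and timeCalls (neither is from the question). An optimizer may still transform the loop, so treat any timing it reports as a rough indication only.

```cpp
#include <chrono>
#include <cstddef>

// Reads every element, so the result depends on the array contents
// and the argument can no longer be ignored by the optimizer.
int countSevens(const char* numbers, std::size_t size) {
    int count = 0;
    for (std::size_t i = 0; i < size; ++i) {
        if (numbers[i] == '7') {
            ++count;
        }
    }
    return count;
}

// Calls countSevens `iterations` times, adds each result to `checksum`
// so the calls have an observable effect, and returns the elapsed
// time in microseconds.
long long timeCalls(const char* numbers, std::size_t size,
                    int iterations, long long& checksum) {
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        checksum += countSevens(numbers, size);
    }
    const auto elapsed = std::chrono::steady_clock::now() - start;
    return std::chrono::duration_cast<std::chrono::microseconds>(elapsed).count();
}
```

Printing the checksum after the timing loop gives the compiler a reason to keep the work.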

Pointers become useful for speeding programs up when you use them to actually address memory. You can index the array and be sure that you'll never pay for the array bounds check. In many cases you won't pay for it in normal usage either; the jitter optimizer is fairly smart about removing the checks if it knows that the indexing is always safe. Indexing through a pointer like this is unsafe, though: you can readily scribble into parts of memory that don't belong to the array and corrupt the GC heap that way. The pointers used for an object reference or a ref argument are never unsafe.
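C++ offers a loose analogy for checked versus unchecked indexing. In this sketch, std::vector::at stands in for a bounds-checked C# array access and raw pointer indexing stands in for access through a fixed pointer. The function names sumChecked and sumUnchecked are made up for illustration, and the comparison says nothing about what the .NET jitter actually emits.

```cpp
#include <cstddef>
#include <vector>

// Checked access: at() validates the index on every call and throws
// std::out_of_range if it is past the end.
long sumChecked(const std::vector<int>& values) {
    long total = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        total += values.at(i);
    }
    return total;
}

// Unchecked access: nothing stops `count` from being larger than the
// real array, in which case this reads memory it does not own.
long sumUnchecked(const int* values, std::size_t count) {
    long total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

Both functions return the same sum for a valid range; the second one only saves the per-element check, and it gives up the safety net in exchange.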

The only real way to see any of this is to look at the generated machine code: Debug + Windows + Disassembly. It is important that you allow the code to still be optimized while you debug it, or you won't see the optimizations. Be sure to run the Release build and use Tools + Options, Debugging, General to untick the "Suppress JIT optimization on module load" option. Some familiarity with machine code is required to make sense of what you see.

Hans Passant answered Jan 13 '23 03:01


The problem is that you aren't measuring what you think you're measuring. I can read your code and see immediately why you would get this result or that result, and it's not just because of pointers or not pointers. There are lots of other factors at play, or potentially at play. The various comments reflect this.

For what it's worth, the main reason one C++ call is much slower than the other is that the slow one copies a std::string on every call and the fast one does not. The C# examples have nothing like that order of difference between them.
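The copy can be made visible with a small C++ sketch. The three helper names below are hypothetical; each reports whether the string it received shares its character buffer with the caller's string. It assumes C++11 or later, where std::string is not copy-on-write.

```cpp
#include <string>

// By value: the argument is a fresh copy, so its buffer is a
// different allocation from the caller's.
bool sameBufferByValue(std::string numbers, const char* callerData) {
    return numbers.data() == callerData;
}

// By const reference: no copy, the callee sees the caller's object.
bool sameBufferByRef(const std::string& numbers, const char* callerData) {
    return numbers.data() == callerData;
}

// By pointer: no copy either, only the address of the string is passed.
bool sameBufferByPointer(const std::string* numbers, const char* callerData) {
    return numbers->data() == callerData;
}
```

With a 300,000-character string like the one in the question, the by-value version has to allocate and copy all of those characters on every call, while the other two pass a single address.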

My suggestion is that, as a bright but early-stage programmer, you focus first on becoming a better programmer. Don't worry about "optimising" until you know what you're trying to achieve.

When you're ready to really understand this problem, you will have to study the generated code. In the case of C# that's MSIL, together with whatever it JITs into on the particular platform. In the case of C++ that's Intel opcodes for whatever processor you target. Until you know what MSIL, JIT, and opcodes are, exactly why you get the results you do will be hard to explain.

david.pfx answered Jan 13 '23 04:01