
Strange Increment Behaviour in C#

Note: the code below is essentially nonsense and is for illustration purposes only.

Since the right-hand side of an assignment must always be evaluated before its value is assigned to the left-hand side variable, and since increment operations such as ++ and -- are always performed right after evaluation, I would not expect the following code to work:

    string[] newArray1 = new[] {"1", "2", "3", "4"};
    string[] newArray2 = new string[4];

    int IndTmp = 0;

    foreach (string TmpString in newArray1)
    {
        newArray2[IndTmp] = newArray1[IndTmp++];
    }

Rather, I would expect newArray1[0] to be assigned to newArray2[1], newArray1[1] to newArray2[2], and so on, up to the point of throwing a System.IndexOutOfRangeException. Instead, and to my great surprise, the version that throws the exception is:

    string[] newArray1 = new[] {"1", "2", "3", "4"};
    string[] newArray2 = new string[4];

    int IndTmp = 0;

    foreach (string TmpString in newArray1)
    {
        newArray2[IndTmp++] = newArray1[IndTmp];
    }

Since, in my understanding, the compiler first evaluates the RHS, assigns it to the LHS, and only then increments, this behaviour is unexpected to me. Or is it really expected, and am I clearly missing something?

asked Jul 02 '11 18:07 by User


1 Answer

ILDasm can be your best friend, sometimes ;-)

I compiled both of your methods and compared the resulting IL (Intermediate Language, the stack-based code the C# compiler emits).

The important detail is in the loop, unsurprisingly. Your first method compiles and runs like this:

    Code         Description                  Stack
    ldloc.1      Load ref to newArray2        newArray2
    ldloc.2      Load value of IndTmp         newArray2,0
    ldloc.0      Load ref to newArray1        newArray2,0,newArray1
    ldloc.2      Load value of IndTmp         newArray2,0,newArray1,0
    dup          Duplicate top of stack       newArray2,0,newArray1,0,0
    ldc.i4.1     Load 1                       newArray2,0,newArray1,0,0,1
    add          Add top 2 values on stack    newArray2,0,newArray1,0,1
    stloc.2      Update IndTmp                newArray2,0,newArray1,0       <-- IndTmp is 1
    ldelem.ref   Load array element           newArray2,0,"1"
    stelem.ref   Store array element          <empty>                       <-- newArray2[0] = "1"

This is repeated for each element in newArray1. The important point is that the location of the element in the source array has been pushed to the stack before IndTmp is incremented.
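To make the ordering easier to follow, here is a rough hand-expansion of the first loop into explicit temporaries, one per stack slot in the IL listing above. The class name, method name, and temporary names (FirstVersionSketch, Copy, targetIndex, sourceIndex and so on) are made up for illustration; this is a sketch of the observed behaviour, not the compiler's actual lowering.

```csharp
using System;

static class FirstVersionSketch
{
    // Hand-expanded equivalent of: target[i] = source[i++];
    // Names are hypothetical; comments map each line to the IL listing.
    public static string[] Copy(string[] source)
    {
        string[] target = new string[source.Length];
        int i = 0;
        foreach (string unused in source)
        {
            string[] targetRef = target;   // ldloc.1
            int targetIndex = i;           // ldloc.2 (index captured first)
            string[] sourceRef = source;   // ldloc.0
            int sourceIndex = i;           // ldloc.2, dup (old value kept)
            i = sourceIndex + 1;           // ldc.i4.1, add, stloc.2
            // ldelem.ref, stelem.ref: both indices were captured before the increment
            targetRef[targetIndex] = sourceRef[sourceIndex];
        }
        return target;
    }

    static void Main()
    {
        string[] result = Copy(new[] { "1", "2", "3", "4" });
        Console.WriteLine(string.Join(",", result)); // prints 1,2,3,4
    }
}
```

Both indices are read before IndTmp changes, which is why this version copies the array element for element.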

Compare this to the second method:

    Code         Description                  Stack
    ldloc.1      Load ref to newArray2        newArray2
    ldloc.2      Load value of IndTmp         newArray2,0
    dup          Duplicate top of stack       newArray2,0,0
    ldc.i4.1     Load 1                       newArray2,0,0,1
    add          Add top 2 values on stack    newArray2,0,1
    stloc.2      Update IndTmp                newArray2,0                   <-- IndTmp is 1
    ldloc.0      Load ref to newArray1        newArray2,0,newArray1
    ldloc.2      Load value of IndTmp         newArray2,0,newArray1,1
    ldelem.ref   Load array element           newArray2,0,"2"
    stelem.ref   Store array element          <empty>                       <-- newArray2[0] = "2"

Here, IndTmp is incremented before the location of the element in the source array has been pushed to the stack, hence the difference in behaviour (and the subsequent exception).
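The effect can also be observed without reading IL. The sketch below runs the second statement in a try/catch and reports the value of the index when the exception is thrown. SecondVersionSketch and Run are hypothetical names used only for this example.

```csharp
using System;

static class SecondVersionSketch
{
    // Runs target[i++] = source[i]; over the array and reports where it fails.
    public static string Run(string[] source, string[] target)
    {
        int i = 0;
        try
        {
            foreach (string unused in source)
            {
                // The target index is taken (and i incremented) before source[i] is read.
                target[i++] = source[i];
            }
            return "no exception";
        }
        catch (IndexOutOfRangeException)
        {
            return "threw with i = " + i;
        }
    }

    static void Main()
    {
        string[] target = new string[4];
        Console.WriteLine(Run(new[] { "1", "2", "3", "4" }, target)); // prints threw with i = 4
        Console.WriteLine(target[0]); // prints 2
    }
}
```

On the last pass the target index is 3, IndTmp has already become 4, and reading newArray1[4] is what throws. By then the first three slots of the target hold "2", "3" and "4", and the last slot is still null.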

For completeness, let's compare it with

    newArray2[IndTmp] = newArray1[++IndTmp];

    Code         Description                  Stack
    ldloc.1      Load ref to newArray2        newArray2
    ldloc.2      Load IndTmp                  newArray2,0
    ldloc.0      Load ref to newArray1        newArray2,0,newArray1
    ldloc.2      Load IndTmp                  newArray2,0,newArray1,0
    ldc.i4.1     Load 1                       newArray2,0,newArray1,0,1
    add          Add top 2 values on stack    newArray2,0,newArray1,1
    dup          Duplicate top stack entry    newArray2,0,newArray1,1,1
    stloc.2      Update IndTmp                newArray2,0,newArray1,1       <-- IndTmp is 1
    ldelem.ref   Load array element           newArray2,0,"2"
    stelem.ref   Store array element          <empty>                       <-- newArray2[0] = "2"

Here, the result of the increment has been pushed to the stack (and becomes the array index) before IndTmp is updated.

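In summary, the target of the assignment (the array reference and its index expression) is evaluated first, followed by the source. This matches C#'s rule that operands are evaluated left to right: a postfix increment takes effect as soon as its operand has been evaluated, not at the end of the statement.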
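If you want to check the order yourself, one option is to wrap each operand in a small logging helper and look at the order in which the calls are recorded. The Trace helper, the Log list, and the class name below are assumptions made for this sketch; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;

static class EvaluationOrderSketch
{
    // Records the order in which operands are evaluated.
    public static readonly List<string> Log = new List<string>();

    // Hypothetical helper: logs a label, then returns the value unchanged.
    static T Trace<T>(string label, T value)
    {
        Log.Add(label);
        return value;
    }

    public static List<string> Run()
    {
        Log.Clear();
        string[] source = { "a" };
        string[] target = new string[1];

        // Same shape as newArray2[...] = newArray1[...], with every operand traced.
        Trace("target array", target)[Trace("target index", 0)] =
            Trace("source array", source)[Trace("source index", 0)];

        return Log;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(" -> ", Run()));
        // prints target array -> target index -> source array -> source index
    }
}
```

The left-hand array and index are logged before anything on the right-hand side, so any side effect in the left-hand index expression is visible to the right-hand side.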

Thumbs up to the OP for a really thought-provoking question!

answered Nov 27 '22 10:11 by Steve Morgan