Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Understanding more about i++ and i=i+1

I was wondering if there is difference between the two forms of increment. Some of the links says i++ is faster that i=i+1;

Also as one of the person my observation is also the same for assembly code. please check the image where both the assembly code are same for i++ and i=i+1 - enter image description here

There is another link that says that it used to be true previously that increment operator was faster than addition and assignment, but now compilers optimize i++ and i=i+1 the same.

Is there any official document/paper which we can refer to confirm what is exactly right ? (I usually go with credit's and number of accepted answers of a person on stackoverflow. Could't find any such thing on the links I provided).

like image 208
Sagrian Avatar asked Oct 05 '14 04:10

Sagrian


3 Answers

Actually, if you do C++, it is much better to get used to write ++i instead. The reason is simple: i++ requires a copy.

a = ++i; // a is set to the result of i+1
a = i++; // make a copy of i, compute i+1, save the copy of i in a

Without optimizations, the assembly code would look like this:

a = ++i;                            a = i++;

MOV eax, (i)                        MOV eax, (i)
                                    PUSH eax
ADD eax, 1                          ADD eax, 1
MOV (i), eax                        MOV (i), eax
                                    POP eax
MOV (a), eax                        MOV (a), eax

Now, with optimizations, the result is the same in C where the ++ operator only applies to integers and pointers.

The ++ and -- are there because most processors had an INC and a DEC instruction at the time C was written. So if you were to use an index register, these instructions would be applied:

char a[256];
...init 'a' in some way...
int sum =0;
for(int i = 0; i < 100; ++i)
{
    sum += a[i];
}

This could be done with a simple INC as in (6502):

    LDA #00
    LDY #00
LOOP:
    CLC
    ADC ($80),Y
    INY              <-- ++i or i++
    CPY #100
    BCC LOOP

Note that in C we have another notation to increment a variable:

i += 1;

This is practical if you need to increment the register by more than 1:

i += 3;

Or to double the register each time:

i += i;   // (equivalent to  i *= 2;  or  i <<= 1;  in C++)

Question: Why is INC and DEC not used with all 80x86?

There has been time when the ADD reg, 1 and SUB reg, 1 instructions were faster than the INC reg and DEC reg. In the old days, it was faster because the instruction was smaller and we had no cache (or very little). Today, either instruction is probably about the same.

From a comment below, a reason for the "slowness" was the FLAGS register:

Intel Optimization Reference, section 3.5.1.1 Use of the INC and DEC Instructions

From another, more current comment, it looks like the slowness referenced in that Intel document has been fixed in newer processors. So the use or non-use of these instructions should depend on the target processor if known in advance.


As pointed out by phresnel in a comment, the difference between i++ and ++i was probably not clear for many people. For an integer, the optimization is really trivial and will most certainly happen even with -O0. However, in C++, that's a different story. There is a class with both increment operators clearly showing that a copy is required for i++ (even if data_ is just an integer, although in that case you could also do: return data_++ -- it still requires a well hidden copy!):

class A
{
public:
    A& operator ++ () // ++i -- no copy
    {
        ...apply the ++ operation to 'data_'...
        return *this;    // return a reference to this
    }

    A  operator ++ (int) // i++ -- needs a temporary copy
    {
        // remember that the 'int' is totally ignored in the function,
        // its only purpose is to distinguish '++i' from 'i++'

        A copy = *this;    // here we need a copy
        ++*this;
        return copy;       // and here we return said copy
    }

private:
    some_type_t   data_;
};

Note that modern C++ compilers do not make two copies in the i++ function as the returned value can be optimized out without the need of an extra copy.

The difference between both cases can be shown as clearly slower if using i++ as described in Is there a performance difference between i++ and ++i in C++? (link mentioned by phresnel)

like image 54
Alexis Wilke Avatar answered Oct 20 '22 20:10

Alexis Wilke


There is no official document. The c spec does not declare that i++ must be faster than i+1 so compilers/optimizers are free to do what they like (and they can make different choices based on surrounding code and optimization level).

I use i++ because it's faster for me to read, with fewer characters to mistype.

like image 26
Ian Avatar answered Oct 20 '22 19:10

Ian


Runthe code profiler between both of them it will be same for both i++ and i = i+1 for current version of gcc compiler. This purely depends on the compiler optimizations.

If you speak about processor specifically you can see the below chart for machine cycles,

INC - Increment

    Usage:  INC     dest
    Modifies flags: AF OF PF SF ZF

    Adds one to destination unsigned binary operand.

                             Clocks                 Size
    Operands         808x  286   386   486          Bytes

    reg8              3     2     2     1             2
    reg16             3     2     2     1             1
    reg32             3     2     2     1             1
    mem             15+EA   7     6     3            2-4  (W88=23+EA)

ADD - Arithmetic Addition

    Usage:  ADD     dest,src
    Modifies flags: AF CF OF PF SF ZF

    Adds "src" to "dest" and replacing the original contents of "dest".
    Both operands are binary.

                             Clocks                 Size
    Operands         808x  286   386   486          Bytes

    reg,reg           3     2     2     1             2
    mem,reg         16+EA   7     7     3            2-4  (W88=24+EA)
    reg,mem          9+EA   7     6     2            2-4  (W88=13+EA)
    reg,immed         4     3     2     1            3-4
    mem,immed       17+EA   7     7     3            3-6  (W88=23+EA)
    accum,immed       4     3     2     1            2-3

You can find the machine cycles taken for each instruction ADD and INC for various processors, 8086, 80286, 80386, 80486,above, you can find this documented in intel processor manuals. Lower the machine cycles faster the response.

like image 26
arahan567 Avatar answered Oct 20 '22 19:10

arahan567