In the GCC (version 4.8.2) manual, the following is stated:
-ftree-loop-if-convert-stores:
Attempt to also if-convert conditional jumps containing memory writes. This transformation can be unsafe for multi-threaded programs as it transforms conditional memory writes into unconditional memory writes. For example,
for (i = 0; i < N; i++) if (cond) A[i] = expr;
is transformed to
for (i = 0; i < N; i++) A[i] = cond ? expr : A[i];
potentially producing data races.
I wonder, however, whether there is a performance gain from using operator ?: instead of the if statement.
In the if version, A[i] is set to expr only if the condition is met. If it is not met, the code inside the statement is skipped.
In the ?: version, A[i] seems to be written regardless of the condition; the condition only affects the value it is set to.
By using operator ?:, we are still doing a check, but we are adding the overhead of a write in the case where the condition is not met. Have I missed something?
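To make the two forms from the manual concrete, here is a minimal sketch in C. The function names store_if and store_select are made up for illustration, and cond and expr are assumed to be per-element arrays, which the manual does not specify. In a single thread both functions leave the array in the same state; the only difference is that the second one reads and writes every element.

```c
#include <assert.h>
#include <stddef.h>

/* Conditional store: a[i] is written only when cond[i] is nonzero.
   (Hypothetical helper; cond/expr as arrays is an assumption.) */
void store_if(int *a, const int *cond, const int *expr, size_t n)
{
    assert(a && cond && expr);
    for (size_t i = 0; i < n; i++)
        if (cond[i])
            a[i] = expr[i];
}

/* If-converted form: a[i] is read and written on every iteration;
   cond[i] only selects which value gets stored. */
void store_select(int *a, const int *cond, const int *expr, size_t n)
{
    assert(a && cond && expr);
    for (size_t i = 0; i < n; i++)
        a[i] = cond[i] ? expr[i] : a[i];
}
```

Calling both on copies of the same input gives identical arrays, which is why the transformation is only a problem once another thread can touch A[i] at the same time.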
What it says is that conditional jumps are converted to conditional move instructions, the cmove family of instructions. They improve speed because they do not stall the processor pipeline the way mispredicted jumps do.
With a jump instruction, the processor does not know in advance which instructions to load, so it makes a prediction and loads one branch into the pipeline. If the prediction was correct, all is well: the next instructions are already executing in the pipeline. However, if the prediction turns out to be wrong once the jump is evaluated, all the following instructions already in the pipeline are useless, so the pipeline must be flushed and the correct instructions loaded. Modern processors have pipelines of roughly 16-30 stages, so branch mispredictions degrade performance severely. Conditional moves avoid this because they do not insert branches into the program flow.
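As a rough illustration of the difference, the sketch below writes the same computation once with an if and once as a select. The function names and the threshold-sum task are invented for this example. Whether the compiler actually emits a jump or a conditional move for either version depends on the compiler, the target, and the optimization level, so the generated assembly (for example from gcc -O2 -S) is the only way to know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Written with a branch: the addition is skipped when the test fails. */
int64_t sum_branchy(const int32_t *v, size_t n, int32_t threshold)
{
    assert(v);
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        if (v[i] >= threshold)
            sum += v[i];
    return sum;
}

/* Written as a select: an addition happens every iteration, and the
   condition only chooses between v[i] and 0. This shape is a natural
   candidate for a conditional move, but that is not guaranteed. */
int64_t sum_select(const int32_t *v, size_t n, int32_t threshold)
{
    assert(v);
    int64_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (v[i] >= threshold) ? v[i] : 0;
    return sum;
}
```

Both functions return the same value for any input. Any speed difference would come from how predictable the condition is on the actual data.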
But does cmove always write?
From Intel x86 Instruction Set Reference:
The CMOVcc instructions check the state of one or more of the status flags in the EFLAGS register [..] and perform a move operation if the flags are in a specified state (or condition). [..] If the condition is not satisfied, a move is not performed and execution continues with the instruction following the CMOVcc instruction.
Edit
After looking further into the GCC manual, I got confused, because as far as I know the compiler does not optimize by transforming C code into other C code; it works on internal data structures such as control flow graphs. So I don't really know what they mean by their example. I suppose they mean the C equivalent of the new flow that is generated. I am no longer sure whether this optimization is about generating cmove instructions.
Edit 2
Since cmove can only write to a register, never to memory, this
if (cond)
    A[i] = expr;
cannot be turned into a cmove directly.
However, this
A[i] = cond ? expr : A[i];
can.
Suppose we have the value of expr in bx.
load A[i] into ax
cmp // cond
cmove ax, bx
store ax into &A[i]
So in order to use cmove, you have to read the value of A[i] and write it back even if cond is false, which is equivalent to the ternary operator form, not to the if statement.
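The data race the manual warns about follows from that load/select/store sequence. The sketch below is a single-threaded model of one bad interleaving, not real multi-threaded code: the comments mark which steps stand in for "thread 1" and "thread 2", and all function names are hypothetical. It shows how the unconditional write-back can overwrite another thread's store even though cond is false.

```c
#include <assert.h>

/* First half of the if-converted store: load the slot and pick the
   value that will be written back (mirrors load + cmove). */
int select_value(const int *slot, int cond, int expr)
{
    assert(slot);
    int old = *slot;          /* load A[i] into a register */
    return cond ? expr : old; /* keep the old value when cond is false */
}

/* Modelled interleaving for the if-converted form, with cond false. */
int converted_form_result(int initial, int other_write)
{
    int slot = initial;
    int v = select_value(&slot, 0, 99); /* "thread 1": load and select */
    slot = other_write;                 /* "thread 2": writes the slot */
    slot = v;                           /* "thread 1": stores old value back */
    return slot;
}

/* Same interleaving for the original if form, with cond false. */
int if_form_result(int initial, int other_write)
{
    int slot = initial;
    int cond = 0;
    slot = other_write; /* "thread 2": writes the slot */
    if (cond)           /* "thread 1": condition is false, no store */
        slot = 99;
    return slot;
}
```

With an initial value of 1 and a write of 2 from the other thread, the if form ends with 2, while the converted form ends with 1: the other thread's update is lost.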