Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Do the multiple postfix-expression(subscripting) evaluations result in UB

#include <iostream>
int main(){
   int  arr[7] = {0,1,2,3,4,3,2};
   arr[0]++[arr]++[arr]++[arr]++[arr]++[arr]++[arr] = 5;  //#1
   for(auto i = 0;i<7;i++){
       std::cout<<i<<" : "<< arr[i]<<std::endl;
   }
}

Consider the above code, Is this evaluation at #1 would result in UB? This is a example I have saw in twitter.

According to the evaluation sequence for postfix ++:
expr.post.incr#1

The value computation of the ++ expression is sequenced before the modification of the operand object.

That means, such a example would result in UB

int arr[2] = {0};
(*(arr[0]++ + arr))++

Because, the side effect caused by expression arr[0]++ and (*(arr[0]++) + arr))++ are unsequenced and applied to the same memory location.

However, for the first example, It's difference. My argument is:
expr.sub#1

The expression E1[E2] is identical (by definition) to *((E1)+(E2)),..., The expression E1 is sequenced before the expression E2.

That means, every value computation and side effect associated with E1 are both sequenced before every value computation and side effect associated with E2.

To simplify the expression at #1, according to the grammar of expression, such a expression should conform to: expr.ass

logical-or-expression assignment-operator initializer-clause

And expr.post#1

Where the logical-or-expression here is a postfix-expression. That is,

postfix-expression [ arr ] = 5;

Where the postfix-expression has the form postfix-expression ++, which in turn, the postfix-expression has the form postfix-expression[arr]. In simple, the left operand of assignment consists of two kinds of postfix-expressions, they alternate combination with each other.

Postfix expressions group left-to-right

So, let the subscript operation has the form E1[E2] and the postfix++ expression has the form PE++, then for the first example, it will give a following decomposition like this:

E1': arr[0]++ 
E2': arr
E1'[E2']: arr[0]++[arr]
PE'++ : E1'[E2']++

E1'': PE'++
E2'': arr
E1''[E2'']: PE'++[arr]
PE''++: E1''[E2''] ++

and so on...

That means, in order to caculate PE'++, E1'[E2'] should be calculated prior PE'++, which is identical to *((E1')+E2'), per the rule E1' is sequenced before E2', hence the side effect caused by E1' is sequenced before value computation for E2'.
In other words, each side effect caused by postfix++ expression must be evaluated prior to that expression combined with the subsequent [arr].

So, in this way, I think such a code at #1 should have well-defined behavior rather than UB. Is there anything I misunderstand? Whether the code is UB or not? If it's not UB, what's the correct result the code will give?

like image 715
xmh0511 Avatar asked Nov 30 '20 12:11

xmh0511


People also ask

What is the use of postfix expression?

Postfix notation, also known as RPN, is very easy to process left-to-right. An operand is pushed onto a stack; an operator pops its operand(s) from the stack and pushes the result. Little or no parsing is necessary. It's used by Forth and by some calculators (HP calculators are noted for using RPN).

What is meant by postfix expression in stack?

A postfix expression is a collection of operators and operands in which the operator is placed after the operands. That means, in a postfix expression the operator follows the operands.

How to evaluate postfix expressions?

In this post, evaluation of postfix expressions is discussed. Following is an algorithm for evaluation postfix expressions. 1) Create a stack to store operands (or values). 2) Scan the given expression and do the following for every scanned element. …..b) If the element is an operator, pop operands for the operator from the stack.

What happens to the topmost operands in a postfix expression?

In postfix expression the topmost operands are popped off and operator is applied and the result is again pushed to the stack and thus finally the stack contains a single value at the end of the process.  (B).

How to solve postfix expressions using stack data structure?

Here also we have to use the stack data structure to solve the postfix expressions. From the postfix expression, when some operands are found, pushed them in the stack. When some operator is found, two items are popped from the stack and the operation is performed in correct sequence.

What is the use of postfix notation in Algebra?

The Postfix notation is used to represent algebraic expressions. The expressions written in postfix form are evaluated faster compared to infix notation as parenthesis are not required in postfix.


2 Answers

I believe your understanding is fine and the code is fine in C++ from C++17.

Is there anything I misunderstand?

No.

Whether the code is UB or not?

No.

If it's not UB, what's the result?

arr[0]++[arr]++[arr]++[arr]++[arr]++[arr]++[arr] = 5;
Side effect: arr[0] := 0 + 1 = 1
       0[arr]++[arr]++[arr]++[arr]++[arr]++[arr] = 5;
Side effect: arr[0] := 1 + 1 = 2
              1[arr]++[arr]++[arr]++[arr]++[arr] = 5;
Side effect: arr[1] := 1 + 1 = 2
                     1[arr]++[arr]++[arr]++[arr] = 5;
Side effect: arr[1] := 2 + 1 = 3
                            2[arr]++[arr]++[arr] = 5;
Side effect: arr[2] := 2 + 1 = 3
                                   2[arr]++[arr] = 5;
Side effect: arr[2] := 3 + 1 = 4
                                          3[arr] = 5;
Side effect: arr[3] := 5

I see the output would be:

0 : 2
1 : 3
2 : 4
3 : 5
4 : 4
5 : 3
6 : 2

Note that the part The expression E1 is sequenced before the expression E2 was added in C++17.

The code is undefined before C++17 and is undefined in C (the tweet was about C code), because in arr[0]++[arr]++ both side effects on arr[0] from ++ are unsequenced to each other.

like image 166
KamilCuk Avatar answered Oct 28 '22 09:10

KamilCuk


The multiply postfix incremented expression is not UB by itself because the standard mandates that:

The value computation of the ++ expression is sequenced before the modification of the operand object.

So in each x++[arr], the postfix incrementation is defered after the value computation and will end into arr[0][arr][arr][arr][arr][arr][arr] and finaly (as arr[0] is initialy 0) into arr[0].

Then you will have a number of incrementation to apply here to the same arr[0] element, and as demonstrated by KalilCuk, all will be fine on that part alone at least in C++17. As arr[0] is 0 all those post incrementation will apply on arr[0], still without UB in that specific user case even before C++17.

The problem is that the assignment is not a sequence point. So a compiler can choose to first apply the assignment and then increment the result (which would give 11), or to first increment the value (which will become 6) and then process the assignment with a final result of 5.

So you are invoking the same UB as the simpler:

i = 0;
i++ = 1;     // 1 or 2 ?

Probably not the one you expected, but still UB...

like image 26
Serge Ballesta Avatar answered Oct 28 '22 09:10

Serge Ballesta