
Why is a statement like 1 + n *= 3 allowed in Ruby?



The precedence tables in many Ruby references list binary arithmetic operators as having higher precedence than their corresponding compound assignment operators. That leads me to believe that code like this shouldn't be valid Ruby, yet it is:

1 + age *= 2

If the precedence rules were correct, I'd expect that the above code would be parenthesized like this:

((1 + age) *= 2) #ERROR: Doesn't compile

But evidently it isn't parsed that way, since the original line is accepted.

So what gives?


asked Aug 26 '19 08:08

No Ordinary Love

1 Answer

Checking ruby -y output, you can see exactly what is happening. Given the source of 1 + age *= 2, the output suggests this happens (simplified):

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that the next token is tOP_ASGN (operator-assignment), tIDENTIFIER is recognised as user_variable, and then as var_lhs.

tOP_ASGN found. Can't deal yet.

tINTEGER found. As with the first tINTEGER, it is ultimately recognised as primary. Knowing that the next token is \n, primary is recognised as arg.

At this moment we have arg + var_lhs tOP_ASGN arg on the stack. In this context, we recognise the last arg as arg_rhs. We can now pop var_lhs tOP_ASGN arg_rhs from the stack and recognise it as arg, leaving the stack as arg + arg, which can be reduced to arg.

arg is then recognised as expr, stmt, top_stmt, top_stmts. \n is recognised as term, then terms, then opt_terms. top_stmts opt_terms are recognised as top_compstmt, and ultimately program.
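The grouping this trace arrives at, 1 + (age *= 2), can be checked at runtime. The following is a minimal sketch; the starting value of 5 for age is an assumption made only for illustration, and eval is used so that the rejected form does not stop the script from loading.

```ruby
# Assumption for illustration: age starts at 5.
age = 5

# Parenthesised only so the whole expression is the right-hand side of `result =`.
result = (1 + age *= 2)

puts age     # 10, because age *= 2 was evaluated as a unit
puts result  # 11, i.e. 1 + (age *= 2)

# The grouping the precedence tables imply is rejected by the parser.
# SyntaxError is not a StandardError, so it has to be rescued by name.
rejected = begin
  eval("age = 5; (1 + age) *= 2")
  false
rescue SyntaxError
  true
end

puts rejected  # true
```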

On the other hand, given the source 1 + age * 2, this happens:

tINTEGER found, recognised as simple_numeric, which is a numeric, which is a literal, which is a primary. Knowing that + comes next, primary is recognised as arg.

+ found. Can't deal yet.

tIDENTIFIER found. Knowing that the next token is *, tIDENTIFIER is recognised as user_variable, then var_ref, then primary, and arg.

* found. Can't deal yet.

tINTEGER found. As with the first tINTEGER, it is ultimately recognised as primary. Knowing that the next token is \n, primary is recognised as arg.

The stack is now arg + arg * arg. arg * arg can be reduced to arg, and the resultant arg + arg can also be reduced to arg. From there, arg is recognised as expr, stmt, top_stmt, top_stmts, and so on up to program, exactly as before.

What's the critical difference? In the first piece of code, age (a tIDENTIFIER) is recognised as var_lhs (the left-hand side of an assignment), but in the second one it's var_ref (a variable reference). Why? Because the parser that Bison generates for Ruby is LALR(1), meaning that it has one token of look-ahead. So age became var_lhs because Ruby saw tOP_ASGN coming up, and var_ref when it saw *.

Ruby can make this choice because it knows (using the huge state transition table that Bison generates) that one specific production is impossible. Specifically, at this point the stack is arg + tIDENTIFIER, and the next token is *=. If tIDENTIFIER were recognised as var_ref (which leads up to arg), and arg + arg were reduced to arg, the parser would be left with arg tOP_ASGN, and there is no rule that starts with that; thus, tIDENTIFIER cannot be allowed to become var_ref, and the next matching rule (the var_lhs one) applies.
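The same distinction can be seen without reading ruby -y output, using the standard library's Ripper. This is a sketch only: last_statement is a hypothetical helper, the age = 5 prefix is an assumption so that age is a known local variable, and Ripper's node names (var_field, var_ref) are its own labels, which only roughly correspond to var_lhs and var_ref in the grammar.

```ruby
require "ripper"

# Hypothetical helper: define age first, then return the parse tree of the
# last statement in the source, or nil if the source does not parse.
def last_statement(source)
  sexp = Ripper.sexp("age = 5\n" + source)
  sexp && sexp[1].last
end

op_assign = last_statement("1 + age *= 2")
plain     = last_statement("1 + age * 2")

# Each tree is [:binary, left, operator, right]; index 3 is the right operand.
p op_assign[3][0]     # :opassign, the right operand of + is the whole age *= 2
p op_assign[3][1][0]  # :var_field, age treated as an assignment target
p plain[3][1][0]      # :var_ref, age treated as a variable reference

# The parenthesisation the precedence table suggests does not parse at all.
p last_statement("(1 + age) *= 2")  # nil
```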

So Aleksei is partly right in that there is some truth to "when it sees a syntax error, it tries another way", but it is limited to one token into the future, and the "attempt" is just a lookup in the state table. Ruby is incapable of the complex repair strategies we humans use to understand sentences like "the horse raced past the barn fell", where we happily parse until the last word, then re-evaluate the whole sentence when the first parse turns out to be impossible.
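To make "just a lookup in the state table" concrete, here is a deliberately tiny toy model. It is not Ruby's real parser table: the constant, the method name, and the token symbols :star, :plus and :newline are all made up for illustration, and a real LALR(1) table is indexed by parser state as well as by look-ahead token.

```ruby
# Toy model only: maps the single look-ahead token to the non-terminal
# that a just-read identifier gets reduced to.
IDENTIFIER_REDUCTIONS = {
  tOP_ASGN: :var_lhs,  # e.g. *= or += follows, so the identifier is a target
  star:     :var_ref,  # * follows, so the identifier is read as a value
  plus:     :var_ref,
  newline:  :var_ref
}.freeze

# One table lookup, no backtracking: an unknown look-ahead is simply an error.
def classify_identifier(lookahead)
  IDENTIFIER_REDUCTIONS.fetch(lookahead) do
    raise ArgumentError, "no entry in the toy table for #{lookahead.inspect}"
  end
end

puts classify_identifier(:tOP_ASGN)  # var_lhs
puts classify_identifier(:star)      # var_ref
```

The decision depends only on the one token that follows; nothing already on the stack is re-parsed.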

tl;dr: The precedence table is not exactly correct. There is no place in the Ruby source where it exists; rather, it is the result of the interplay of various parsing rules. Many of the precedence rules break down when the left-hand side of an assignment is introduced.

answered Dec 10 '22 20:12

Amadan
