
How does a Java compiler parse typecasts?

A simple expression like

(x) - y

is interpreted differently depending on whether x is a type name or not. If x is not a type name, (x) - y just subtracts y from x. But if x is a type name, (x) - y computes the negative of y and casts the resulting value to type x.

In a typical C or C++ compiler, the question of whether x is a type or not is answerable because the parser communicates such information to the lexer as soon as it processes a typedef or struct declaration. (I think this forced violation of the lexer/parser layering was the nastiest part of the design of C.)

But in Java, x may not be defined until later in the source code. How does a Java compiler disambiguate such an expression?

It's clear that a Java compiler needs multiple passes, since Java doesn't require declaration-before-use. But that seems to imply that the first pass has to do a very sloppy job on parsing expressions, and then in a later pass do another, more accurate, parse of expressions. That seems wasteful.

Is there a better way?

Thom Boyer asked Dec 30 '08 17:12


1 Answer

I think I've found the solution that satisfies me. Thanks to mmyers, I realized that I needed to check the formal spec of the syntax for type casts.

The ambiguity is caused by + and - being both unary and binary operators. Java solves the problem by this grammar:

CastExpression:
        ( PrimitiveType Dims_opt ) UnaryExpression
        ( ReferenceType ) UnaryExpressionNotPlusMinus

(see http://java.sun.com/docs/books/jls/third_edition/html/expressions.html#238146)
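To see what these two productions mean in practice, here is a small sketch. The class and method names are made up for illustration. It assumes a compiler that follows the grammar above: a primitive cast may be followed directly by a unary minus, a reference cast needs parentheses around the negated operand, and `(Name) - y` is always read as a subtraction, even when the name happens to match a well-known class.

```java
// Hypothetical demo class; the names are illustrative, not from the JLS.
final class CastParseDemo {

    // (x) is a parenthesized variable, so '-' is the binary operator.
    static int parenthesizedVariable(int x, int y) {
        return (x) - y;
    }

    // Primitive cast: the first production allows any UnaryExpression,
    // including one that starts with unary minus.
    static int primitiveCast(int y) {
        return (int) - y;
    }

    // Reference cast: the operand must be a UnaryExpressionNotPlusMinus,
    // so the negation has to be wrapped in parentheses.
    static Integer referenceCast(int y) {
        return (Integer) (-y);
    }

    // Because '-' cannot follow a reference cast, (Integer) - y is parsed
    // as "expression named Integer, minus y". It compiles here only because
    // a local variable called Integer is in scope.
    static int variableNamedInteger(int y) {
        int Integer = 10;
        return (Integer) - y;
    }

    public static void main(String[] args) {
        System.out.println(parenthesizedVariable(10, 3)); // 7
        System.out.println(primitiveCast(3));             // -3
        System.out.println(referenceCast(3));             // -3
        System.out.println(variableNamedInteger(3));      // 7
    }
}
```

Without the local variable in the last method, `(Integer) - y` would be rejected, because the compiler would look for a variable named Integer rather than treating the parentheses as a cast.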

So '+' and '-' are explicitly disallowed immediately after the ')' of a cast unless the cast uses a primitive type. Primitive type names are reserved keywords, so the compiler knows them a priori.
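One consequence is that a parser can tell a cast from a parenthesized expression by looking at the token after the ')', without knowing whether the name is a type. The following is only a rough sketch of that idea, not how any particular compiler implements it. `CastLookahead`, `classify`, and the string-based tokens are hypothetical, and the sketch assumes a single token inside the parentheses (no array dimensions, qualified names, or generics) and ignores lambdas, which the third-edition grammar predates.

```java
import java.util.Set;

// Hypothetical helper for illustration; not taken from any real compiler.
final class CastLookahead {

    enum Kind { CAST, PARENTHESIZED_EXPRESSION }

    // Primitive type names are keywords, so no symbol table is needed.
    private static final Set<String> PRIMITIVE_TYPES = Set.of(
            "boolean", "byte", "short", "char", "int", "long", "float", "double");

    // Classifies "( insideParens ) nextToken ..." using one token of lookahead.
    static Kind classify(String insideParens, String nextToken) {
        if (PRIMITIVE_TYPES.contains(insideParens)) {
            return Kind.CAST; // "(int)" can never be an expression
        }
        return startsUnaryNotPlusMinus(nextToken)
                ? Kind.CAST
                : Kind.PARENTHESIZED_EXPRESSION;
    }

    // Simplified test: can this token begin a UnaryExpressionNotPlusMinus?
    private static boolean startsUnaryNotPlusMinus(String token) {
        if (token.isEmpty() || token.equals("instanceof")) {
            return false; // "instanceof" is a binary operator, not an operand
        }
        if (token.equals("~") || token.equals("!") || token.equals("(")) {
            return true;
        }
        char first = token.charAt(0);
        // Identifiers, keywords such as this/new, and literals start an operand.
        return Character.isJavaIdentifierStart(first)
                || Character.isDigit(first)
                || first == '"'
                || first == '\'';
    }

    public static void main(String[] args) {
        System.out.println(classify("x", "-"));   // PARENTHESIZED_EXPRESSION
        System.out.println(classify("int", "-")); // CAST
        System.out.println(classify("x", "y"));   // CAST
    }
}
```

Whether x later turns out to be a type or a variable is then a question for name resolution, not for the parser, so no second parse of the expression is needed.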

Thom Boyer answered Nov 03 '22 10:11