
Why is `try` an explicit keyword?

In all exception-aware languages I know (C++, Java, C#, Python, Delphi-Pascal, PHP), catching exceptions requires an explicit try block followed by catch blocks. I have often wondered what the technical reason for that is. Why couldn't we just append catch clauses to an otherwise ordinary block of code? As a C++ example, why do we have to write this:

int main()
{
  int i = 0;
  try {
    i = foo();
  }
  catch (std::exception& e)
  {
    i = -1;
  }
}

instead of this:

int main()
{
  int i = 0;
  {
    i = foo();
  }
  catch (std::exception& e)
  {
    i = -1;
  }
}

Is there an implementation reason for this, or is it just "somebody first designed it that way and now everyone is just familiar with it and copies it?"

The way I see it, it makes no sense for compiled languages: the compiler sees the entire source code tree before generating any code, so it could easily insert the try keyword in front of a block on the fly when a catch clause follows that block (if it needs to generate special code for try blocks in the first place). I could imagine some use in interpreted languages which do no parsing in advance and at the same time need to take some action at the start of a try block, but I don't know if any such languages exist.

Let's leave aside languages without an explicit way to declare arbitrary blocks (such as Python). In all the others, is there a technical reason for requiring a try keyword (or equivalent)?

Angew is no longer proud of SO asked Apr 23 '14 08:04



2 Answers

The general idea when designing languages is to indicate as early as possible what construct you're in, so that the compiler doesn't have to perform unnecessary work. What you suggest would require remembering every {} block as a possible try block start, only to find that most of them aren't. You will find that almost every statement in Pascal, C, C++, Java, etc. is introduced by a keyword, the main exceptions being assignments and similar expression statements such as calls.

user207421 answered Sep 22 '22 17:09



There are several kinds of answers to this question, all of which might be relevant.

The first question is about efficiency and a distinction between compiled and interpreted languages. The basic intuition is correct, that the details of syntax don't affect generated code. Parsers usually generate an abstract syntax tree (be that explicitly or implicitly), be it for compilers or interpreters. Once the AST is in place, the details of syntax used to generate the AST are irrelevant.

The next question is whether requiring an explicit keyword assists in parsing or not. The simple answer is that it's not necessary, but can be helpful. To understand why it's not necessary, you have to know what a "lookahead set" is for a parser. For each parser state, the lookahead set is the set of tokens that would be grammatically valid if they appeared next in the token stream. Parser generators such as bison model this lookahead set explicitly. Recursive descent parsers also have lookahead sets, but these usually do not appear explicitly in a table.

Now consider a language that, as proposed in the question, uses the following syntax for exceptions:

block: "{" statement_list "}" ;
statement: block ;
statement: block "catch" block ;
statement: //... other kinds of statements

With this syntax, a block can either be adorned with an exception block or not. The question is whether, after the parser has seen a block, a following catch keyword could be ambiguous. Assuming that the catch keyword is unique (it cannot begin any other construct), there is no ambiguity: one token of lookahead tells the parser that it is recognizing an exception-adorned statement.
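To make that concrete, here is a minimal sketch of a recursive descent recognizer for the grammar above. Everything in it is a hypothetical illustration rather than code from a real compiler: the Parser class, the pre-split string tokens, and the placeholder statement `stmt ;` standing in for "other kinds of statements" are all assumptions. The point is only that a single peek after the closing brace is enough to choose between the two block productions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recognizer for the try-less grammar. Tokens are pre-split
// strings; "stmt ;" is a stand-in for every other kind of statement.
class Parser {
public:
    explicit Parser(std::vector<std::string> tokens) : toks_(std::move(tokens)) {}

    // True if the whole token stream is a valid statement list.
    bool parseProgram() {
        assert(pos_ == 0);  // a Parser instance is meant to be used once
        return statementList() && pos_ == toks_.size();
    }

    // Number of "block catch block" statements recognized so far.
    int catchCount() const { return catches_; }

private:
    const std::string& peek() const {
        static const std::string eof = "<eof>";
        return pos_ < toks_.size() ? toks_[pos_] : eof;
    }

    bool accept(const std::string& t) {
        if (peek() != t) return false;
        ++pos_;
        return true;
    }

    // block: "{" statement_list "}"
    bool block() { return accept("{") && statementList() && accept("}"); }

    // statement_list: zero or more statements
    bool statementList() {
        while (peek() == "{" || peek() == "stmt") {
            if (!statement()) return false;
        }
        return true;
    }

    // statement: block | block "catch" block | "stmt" ";"
    bool statement() {
        if (peek() == "{") {
            if (!block()) return false;
            // One token of lookahead decides between the two productions.
            if (accept("catch")) {
                ++catches_;
                return block();
            }
            return true;
        }
        return accept("stmt") && accept(";");
    }

    std::vector<std::string> toks_;
    std::size_t pos_ = 0;
    int catches_ = 0;
};
```

Note that the parser only learns it was inside an exception-adorned statement after the first block has been parsed in full, which is exactly the "remember every block" cost the other answer describes.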

Now I said that it's helpful for the parser to have an explicit try keyword. In what way is it helpful? It constrains the lookahead set for certain parser states. The lookahead set after try itself is the single token {. The lookahead set after the matching close brace is the single keyword catch. A table-driven parser doesn't care about this, but it makes a hand-written recursive descent parser a bit easier to write. More importantly, though, it improves error handling in the parser. If a syntax error occurs in the first block, having a try keyword means that error recovery can look for a catch token as a fence post at which to re-establish a known parser state. This is possible precisely because catch is the single member of a lookahead set.
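The fence-post idea can be sketched as a small panic-mode recovery routine. This is an illustration under simplifying assumptions, not how any particular compiler does it: the function name syncToCatch and the string tokens are hypothetical, and it assumes every try has exactly one catch so that nested pairs can be counted off.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical panic-mode recovery, called after a syntax error inside the
// block that follows `try`. `pos` is the index of the offending token.
// Returns the index of the `catch` belonging to the enclosing `try`, or
// tokens.size() if there is none.
// Simplifying assumption: every `try` has exactly one `catch`.
std::size_t syncToCatch(const std::vector<std::string>& tokens, std::size_t pos) {
    assert(pos <= tokens.size());
    int nestedTries = 0;  // `try` keywords seen while skipping
    for (; pos < tokens.size(); ++pos) {
        if (tokens[pos] == "try") {
            ++nestedTries;
        } else if (tokens[pos] == "catch") {
            if (nestedTries == 0) return pos;  // fence post for our `try`
            --nestedTries;                     // closes a nested `try`
        }
    }
    return tokens.size();
}
```

Because the parser saw try before the error, it knows a catch must follow and can resume there. Without the keyword, it would have no reason to expect a catch after an arbitrary broken block.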

The last question about a try keyword has to do with language design. Simply put, having explicit keywords in front of blocks makes the code easier to read. Humans still have to parse the code by eye, even if they don't use computer algorithms to do it. Reducing the size of the lookahead set in the formal grammar also reduces the possibilities of what a section of code might mean when first glanced at. This improves the clarity of the code.

eh9 answered Sep 19 '22 17:09
