Question

In all exception-aware languages I know (C++, Java, C#, Python, Delphi-Pascal, PHP), catching exceptions requires an explicit try block followed by catch blocks. I was often wondering what the technical reason for that is. Why couldn't we just append catch clauses to an otherwise ordinary block of code? As a C++ example, why do we have to write this:

int main()
{
  int i = 0;
  try {
    i = foo();
  }
  catch (std::exception& e)
  {
    i = -1;
  }
}

instead of this:

int main()
{
  int i = 0;
  {
    i = foo();
  }
  catch (std::exception& e)
  {
    i = -1;
  }
}

Is there an implementation reason for this, or is it just "somebody first designed it that way and now everyone is just familiar with it and copies it?"

The way I see it, it makes no sense for compiled languages - the compiler sees the entire source code tree before generating any code, so it could easily insert the try keyword in front of a block on the fly when a catch clause follows that block (if it needs to generate special code for try blocks in the first place). I could imagine some use in interpreted languages which do no parsing in advance and at the same time need to take some action at the start of a try block, but I don't know whether any such language exists.

Let's leave aside languages without an explicit way to declare arbitrary blocks (such as Python). In all the others, is there a technical reason for requiring a try keyword (or equivalent)?

Solution 2

There are several kinds of answers to this question, all of which might be relevant.

The first question is about efficiency and a distinction between compiled and interpreted languages. The basic intuition is correct: the details of syntax don't affect generated code. Parsers usually generate an abstract syntax tree (whether explicitly or implicitly), be it for compilers or interpreters. Once the AST is in place, the details of the syntax used to generate it are irrelevant.
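To make the point concrete, here is a minimal sketch (the node and field names are invented for illustration) of an AST node that either surface syntax could produce; once the parser has built this node, code generation cannot tell whether the source contained a try keyword:

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical AST fragment: whether the parser consumed a "try" keyword
// or inferred the construct from a trailing "catch", it builds this same
// node, and the surface syntax is gone by the time code is generated.
struct Stmt { virtual ~Stmt() = default; };
using Block = std::vector<std::unique_ptr<Stmt>>;

struct TryCatchStmt : Stmt {
    Block tryBody;               // statements in the guarded block
    std::string exceptionType;   // e.g. "std::exception&"
    Block catchBody;             // statements in the handler
};
```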

The next question is whether requiring an explicit keyword assists in parsing or not. The simple answer is that it's not necessary, but it can be helpful. To understand why it's not necessary, you have to know what a "lookahead set" is for a parser. The lookahead set is a set of tokens for each parsing state that would be grammatically correct if they were to appear next in the token stream. Parser generators such as bison model this lookahead set explicitly. Recursive descent parsers also have a lookahead set, but it often does not appear explicitly in a table.

Now consider a language that, as proposed in the question, uses the following syntax for exceptions:

block: "{" statement_list "}" ;
statement: block ;
statement: block "catch" block ;
statement: //... other kinds of statements

With this syntax, a block can either be adorned with an exception block or not. The question about ambiguity is whether, after having seen a block, the catch keyword is ambiguous. Assuming that the catch keyword is unique, it's completely unambiguous that the parser is going to recognize an exception-adorned statement.

Now, I said that it's helpful to have an explicit try keyword for the parser. In what way is it helpful? It constrains the lookahead set for certain parser states. The lookahead set after try itself is the single token {. The lookahead set after the matching close brace is the single keyword catch. A table-driven parser doesn't care about this, but it makes a hand-written recursive descent parser a bit easier to write. More importantly, though, it improves error handling in the parser. If a syntax error occurs in the first block, having a try keyword means that error recovery can look for a catch token as a fence post at which to re-establish a known parser state, precisely because it's the single member of a lookahead set.
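As a sketch of that point (a toy parser over a pre-tokenized stream, with invented names and empty-bodied blocks for brevity), a hand-written recursive descent parser can hard-code the single-token lookahead sets that the try keyword creates:

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Toy recursive descent parser: with an explicit "try", the lookahead
// after it is the single token "{", and after the matching "}" it is
// the single token "catch" -- each step knows exactly what must come next.
struct Parser {
    std::vector<std::string> toks;
    size_t pos = 0;

    void expect(const std::string& t) {
        if (pos >= toks.size() || toks[pos] != t)
            throw std::runtime_error("expected '" + t + "'");
        ++pos;
    }
    void parseBlock() {        // block: "{" "}" (statement list omitted)
        expect("{");
        expect("}");
    }
    void parseTryStatement() {
        expect("try");         // lookahead set here: { "{" }
        parseBlock();
        expect("catch");       // lookahead set here: { "catch" }
        parseBlock();
    }
};
```

Without the try keyword, the parser could only discover after consuming an entire block whether a catch follows, which is exactly the wider lookahead the keyword avoids.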

The last question about a try keyword has to do with language design. Simply put, having explicit keywords in front of blocks makes the code easier to read. Humans still have to parse the code by eye, even if they don't use computer algorithms to do it. Reducing the size of the lookahead set in the formal grammar also reduces the possibilities of what a section of code might mean when first glanced at. This improves the clarity of the code.

OTHER TIPS

The general idea when designing languages is to indicate as early as possible what construct you're in, so that the compiler doesn't have to perform unnecessary work. What you suggest would require remembering every {} block as a possible try block start, only to find that most of them aren't. You will find that every statement in Pascal, C, C++, Java, etc is introduced by a keyword with the sole exception of assignment statements.

Speaking from a practical viewpoint: by specifying the try you allow for better modularity in catching exceptions. Specifically, it allows for cleaner nesting of exception handling. To add to EJP's answer, it adds to the readability when the catching blocks are embedded within others. Readability is an important consideration, and when there are multiple nested {} blocks, the try adds an excellent reference point for discrete catches.

Requiring that control structures which attach to the ends of blocks be paired with indicators before those blocks avoids confusion in scenarios like:

if (condition1)
  do {
    action1();
  } while(condition2);
else
  action2();

Imagine that, instead of do statement; while(condition);, C had used the syntax statement; until(!condition);. Does that make things more or less clear?

if (condition1)
  {
    action1();
  } until(!condition2);
else
  action2();

I would consider the former code snippet perfectly readable without requiring a separate compound statement on the first if (not having a separate compound statement suggests a loop whose first-pass condition is given in the if, and whose repeat condition is given below, with a special zero-iterations handler below that). The second version seems much less clear to me. One could clarify the second by enclosing the loop in a compound statement, but that would effectively add more verbosity than the do.

I asked a question which implies an answer to this one.

Are try blocks necessary or even helpful for the "zero-cost" stack unwinding strategy?

Explicit try blocks can allow for more efficient implementation of exception handling, especially when exceptions are thrown. The two popular strategies to implement exceptions are "setjmp/longjmp" and "zero-cost".

The setjmp/longjmp strategy, named for functions in the C standard library, saves context information upon entering a try block. That information will roughly be "in this context, exceptions of this type jump to this address, exceptions of this other type jump to that address, and other exception types bubble up the context stack". This allows thrown exceptions to find the matching catch quickly, but it requires saving context at runtime, even when no exceptions are thrown.
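A minimal sketch of this strategy, using the real setjmp/longjmp from <csetjmp> but with a single invented one-deep handler "stack" in place of a real context stack:

```cpp
#include <csetjmp>

// Sketch of the setjmp/longjmp strategy: entering the guarded region
// saves context at run-time; a "throw" is a longjmp back to it.
static std::jmp_buf handler_ctx;       // stand-in for a handler stack

void may_fail(bool fail) {
    if (fail)
        std::longjmp(handler_ctx, 1);  // "throw": jump to the saved context
}

int run(bool fail) {
    if (setjmp(handler_ctx) == 0) {    // "try": save context (paid even if
        may_fail(fail);                // no exception is ever thrown)
        return 0;                      // normal path
    }
    return -1;                         // "catch": the longjmp lands here
}
```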

In the zero-cost strategy, try blocks have no inherent cost but finding the catch block for a thrown exception is slow. Instead of saving context information at run-time upon entering a try block, the compiler builds tables at compile-time that can be used to find catch blocks given the origin of a thrown exception. The table specifies instruction ranges and the associated catch blocks. One way to implement this would be with variable range size and binary search.
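A sketch of such a table lookup (the table contents and addresses here are made up; real implementations, such as the Itanium ABI's unwind tables, are considerably more involved):

```cpp
#include <algorithm>
#include <cstdint>

// Sketch of the zero-cost strategy's lookup: a compile-time table maps
// instruction-address ranges to handler addresses, so entering a try
// block does no work -- only a throw pays for the search.
struct UnwindEntry {
    uint64_t start, end;   // half-open instruction range [start, end)
    uint64_t handler;      // address of the matching catch block
};

// Table sorted by range start, as a compiler/linker might emit it.
static const UnwindEntry kTable[] = {
    {0x1000, 0x1040, 0x2000},
    {0x1040, 0x10a0, 0x2080},
    {0x1200, 0x1280, 0x2100},
};

// Binary search for the range containing the throwing instruction.
// Returns 0 if no handler covers it (the exception propagates further).
uint64_t find_handler(uint64_t pc) {
    auto it = std::upper_bound(
        std::begin(kTable), std::end(kTable), pc,
        [](uint64_t p, const UnwindEntry& e) { return p < e.start; });
    if (it == std::begin(kTable)) return 0;   // pc precedes every range
    --it;                                     // last range starting <= pc
    return (pc >= it->start && pc < it->end) ? it->handler : 0;
}
```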

The setjmp/longjmp strategy requires try blocks to know when to save context.

The zero-cost strategy does not depend on try blocks.

Since the two methods have trade-offs in efficiency, it makes sense to leave the choice up to language implementers and to provide the needed explicit try blocks for the setjmp/longjmp strategy.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow