More Exceptions: Semantics

Last week, I discussed how the internals of the stack unwinder work during exception handling. This week, I am going to talk about how Plinth’s exceptions look from a programmer’s point of view, and their semantics.

Exceptional Type System

Everything which can be thrown in Plinth implements an interface called Throwable. If something implements Throwable, it can be thrown and caught using throw statements and try-catch blocks.

Underneath Throwable, there are two classes: Error and Exception. The distinction between these is slightly fuzzy, but the main idea is:

Anything which a programmer should expect to have to handle should be an Exception.
Anything which a programmer shouldn’t expect should be an Error.

For example, a programmer shouldn’t expect a cast to fail. Since the value being cast is completely under their control, they should check the value for themselves before casting. Thus, a cast failure results in a CastError (which extends Error).

In another case, a programmer might be interacting with some system they don’t have control over, such as the file system. If they try to open a non-existent file, the I/O library might throw a FileNotFoundException (which extends Exception).

Checked Exceptions

This is a feature a lot of languages either skip (C♯) or get slightly wrong (Java).

A checked exception is an exception whose propagation the compiler checks. If something throws some checked FooException, then that FooException must either be caught or be declared to be thrown onwards.

In Java, this ensures that if something might throw an exception, you have to declare it as being rethrown at every stage until it’s caught. This can be a big problem if you want to use higher order functions (e.g. as described here), since the higher-order function usually has no idea what exceptions might be thrown through it. You can get around this problem by tunnelling: boxing a checked exception inside an unchecked exception and re-throwing that. The problem with this is that in order to catch the boxed exception again, you must write a catch block which matches any instance of the unchecked exception, and do instanceof checks on the one boxed inside it.
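To make the tunnelling workaround concrete, here is a minimal Java sketch. The names TunnelException, map, load and run are all made up for illustration, and RuntimeException stands in for "some unchecked exception". Note how the outer catch has to match the wrapper and then inspect what is boxed inside it.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class Tunnelling {
    // Hypothetical unchecked wrapper that carries a checked exception
    // through code which does not declare it.
    static class TunnelException extends RuntimeException {
        TunnelException(Exception cause) { super(cause); }
    }

    // A higher-order function: it has no idea what f might throw,
    // so it cannot declare any checked exceptions on f's behalf.
    static <A, B> List<B> map(List<A> in, Function<A, B> f) {
        List<B> out = new ArrayList<>();
        for (A a : in) {
            out.add(f.apply(a));
        }
        return out;
    }

    // Hypothetical operation that throws a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("no such file");
        }
        return "contents of " + name;
    }

    static String run(List<String> names) {
        try {
            List<String> loaded = map(names, n -> {
                try {
                    return load(n);
                } catch (IOException e) {
                    // Box the checked exception so it can pass through map().
                    throw new TunnelException(e);
                }
            });
            return loaded.toString();
        } catch (TunnelException t) {
            // We can only catch the wrapper; the real type needs an instanceof check.
            if (t.getCause() instanceof IOException) {
                return "I/O failure: " + t.getCause().getMessage();
            }
            throw t;
        }
    }
}
```

Calling Tunnelling.run with a list containing an empty name takes the catch path and returns "I/O failure: no such file".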

In my opinion, Plinth handles cases like this much better than Java. Instead of having certain types which are checked and others which are unchecked, Plinth considers every Throwable to be checked. However, it allows you to rethrow them as unchecked:

void mightThrow() throws Error, unchecked FooException
{
  if someTest()
  {
    throw new FooException();
  }
  throw new Error();
}

void caller() throws Error
{
  mightThrow();
}

Here, mightThrow() has to declare all of the Throwables it might throw. Because it can throw both FooException and Error, and doesn’t catch either of them, it must have both in its throws clause.

The point here is that it declares FooException as unchecked. Because of this, caller() doesn’t have to declare that it could throw FooException. Nevertheless, it must declare that it throws Error, since it doesn’t catch it.

One point to mention here is that any method can declare that it throws any Throwable type without generating compiler warnings for unnecessary declared exceptions. This is because it could always be rethrowing an unchecked exception as a checked exception.

This system still has the useful property that you have to deal with every exception somewhere. The difference is that now you can explicitly decide to ignore it without writing a whole try-catch statement.

Try-Catch-Finally

Try statements are very useful not just in exception handling, but in general for cleanup code. They function very similarly in Plinth to how they do in other languages.

The basic semantics are that the try block tries to execute, and if it fails at any point by throwing an exception, the language checks each of the catch blocks to see if any of them match the exception. If any do, the first matching catch block is run. The finally block is run after all of the try/catch blocks have finished, even if any of the try or catch blocks throw an exception, return, or try to break out of or continue through an enclosing loop. After the finally block has been run, execution continues wherever it was going before it entered the finally block; for example, it could continue throwing an exception, or it could branch back to the start of a loop.
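Since these semantics are described as being very close to other languages, a small Java sketch can show the ordering. This is Java rather than Plinth, and FinallyOrder and trace are made-up names. It records which blocks run when the try block is left normally, via continue, via an exception, and via break.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyOrder {
    // Records the order in which the try, catch and finally blocks run.
    static List<String> trace(int[] inputs) {
        List<String> log = new ArrayList<>();
        for (int x : inputs) {
            try {
                if (x == 0) {
                    continue; // leaves the try block via continue
                }
                if (x < 0) {
                    break; // leaves the try block via break
                }
                if (x > 100) {
                    throw new IllegalStateException("too big");
                }
                log.add("try " + x);
            } catch (IllegalStateException e) {
                log.add("catch " + x);
            } finally {
                // Runs on every path out of the try/catch blocks.
                log.add("finally " + x);
            }
        }
        return log;
    }
}
```

For the inputs 1, 0, 200, -1, 5 the log is: try 1, finally 1, finally 0, catch 200, finally 200, finally -1. The final 5 is never reached, because the break leaves the loop after its finally block has run.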

An example of a try-catch-finally statement is as follows:

try
{
  callSomething();
}
catch IOException | FooException exc
{
  stderr::println("Caught an IOException or a FooException!");
}
catch Error err
{
  stderr::println("Caught an Error!");
}
finally
{
  doSomeCleanup();
}

As you can see, multiple exceptions can use the same catch block. In the first catch block above, the type of exc is the lowest common super-type of IOException and FooException (or Throwable if there are two equally-close super-types).
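Java's multi-catch types its variable in a similar way, so a Java sketch can illustrate the common super-type rule. The classes BaseException and IoLikeException, and the method describe(), are hypothetical, and the fallback to Throwable for equally-close super-types is a Plinth rule that is not modelled here.

```java
class MultiCatch {
    // Hypothetical common super-type with a member of its own.
    static class BaseException extends Exception {
        String describe() {
            return "base:" + getClass().getSimpleName();
        }
    }

    static class IoLikeException extends BaseException {}

    static class FooException extends BaseException {}

    static void callSomething(boolean io) throws IoLikeException, FooException {
        if (io) {
            throw new IoLikeException();
        }
        throw new FooException();
    }

    static String handle(boolean io) {
        try {
            callSomething(io);
            return "no exception";
        } catch (IoLikeException | FooException exc) {
            // exc is typed as the closest common super-type (BaseException here),
            // so describe() can be called without a cast.
            return exc.describe();
        }
    }
}
```

Here handle(true) returns "base:IoLikeException" and handle(false) returns "base:FooException".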

Control Flow of Try-Catch-Finally

Control flow checking is a compiler pass which tracks which variables are initialised where. It primarily maintains two sets: the set of initialised variables, and the set of possibly-initialised variables. These sets are used to check that variables are not used before they are initialised, and that final variables are not initialised twice. During a constructor, the control flow checker checks some extra things, such as member variables.
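A rough Java sketch of the two sets might look like the following. FlowState and its methods are hypothetical names, not the Plinth compiler's actual classes. The join rule is an assumption on my part: a variable is definitely initialised after a merge only if both paths initialised it, and possibly initialised if either did.

```java
import java.util.HashSet;
import java.util.Set;

class FlowState {
    final Set<String> initialised = new HashSet<>();
    final Set<String> possiblyInitialised = new HashSet<>();

    // An assignment makes the variable both definitely and possibly initialised.
    void assign(String variable) {
        initialised.add(variable);
        possiblyInitialised.add(variable);
    }

    // A variable may only be read if it is definitely initialised.
    boolean canRead(String variable) {
        return initialised.contains(variable);
    }

    // A final variable may only be assigned if it cannot already hold a value.
    boolean canAssignFinal(String variable) {
        return !possiblyInitialised.contains(variable);
    }

    // Merge two control flow paths (assumed rule: intersection and union).
    static FlowState join(FlowState a, FlowState b) {
        FlowState result = new FlowState();
        result.initialised.addAll(a.initialised);
        result.initialised.retainAll(b.initialised);
        result.possiblyInitialised.addAll(a.possiblyInitialised);
        result.possiblyInitialised.addAll(b.possiblyInitialised);
        return result;
    }
}
```

For example, if one branch of an if statement assigns x and y while the other assigns only x, then after the join x can be read, but y can neither be read nor assigned as a final variable.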

Try-catch-finally statements are easily the most complicated statements in terms of control-flow. If a finally block is involved, the semantics involved in tracking which variables are initialised are very tricky to define. To illustrate this, here is a sketch of the threads of control flow that can happen in a try-catch-finally statement (ignoring break and continue):

The arrow coming from the top is where control begins, and the one exiting at the bottom is where it goes on to any statements after the try-catch-finally statement.

The first thing to notice is that on the way to join-up point (a), the try statement may not be executed. This models the fact that an exception may be thrown at any point during the try statement, causing it to terminate abruptly. In fact, any lines of control flow in the try statement that stop execution (i.e. with a break/continue/throw/return) are also combined into join-up point (a). This allows that point to model accurately which variables have been initialised after the try statement halts abnormally.

After point (a), control goes through the catch blocks. Since each of these can also terminate with a break/continue/throw/return, we must also short-circuit all of the catches on the way to the finally block, as we did with the try block. We then reach join-up point (b).

Next, we must process the finally block. There is only one finally block, but the diagram contains it twice. Finally [1] is there so that we can do a control-flow check on the finally block with an accurate model of all possible control flow states entering it, whereas finally [2] is there to help us find out which variables are initialised when control continues on to the statement immediately after the end of the try-catch-finally statement.

At join-up point (b), we have an accurate model that says which variables definitely are initialised and which might be initialised just before the finally block. We use this to check that the finally block doesn’t break any control-flow rules about variables.

In order to model the control flow after the try-catch-finally statement, we combine together the states at each of the returning try and catch blocks at join-up point (c). If a try/catch block doesn’t return, we don’t combine it into that join-up point, because it won’t affect what happens to the control flow after the finally block. Finally, finally [2] finishes, and we can continue to process whatever statements are next.
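The difference between join-up points (b) and (c) can be sketched in Java using only the definitely-initialised sets. This is a simplification under my own assumptions, not the compiler's real code: TryJoin and its methods are hypothetical, states are merged by intersection, and the short-circuited paths are represented by the state on entry to the try statement.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class TryJoin {
    // A variable is definitely initialised after a merge only if every path initialised it.
    static Set<String> meet(Set<String> a, Set<String> b) {
        Set<String> result = new HashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Point (b): every route into the finally block, including the one
    // where the try block throws before it has initialised anything.
    static Set<String> atB(Set<String> entry, Set<String> tryEnd, List<Set<String>> catchEnds) {
        Set<String> result = meet(entry, tryEnd);
        for (Set<String> catchEnd : catchEnds) {
            result = meet(result, catchEnd);
        }
        return result;
    }

    // Point (c): only the try and catch blocks which complete normally.
    static Set<String> atC(Set<String> tryEnd, List<Set<String>> catchEnds) {
        Set<String> result = new HashSet<>(tryEnd);
        for (Set<String> catchEnd : catchEnds) {
            result = meet(result, catchEnd);
        }
        return result;
    }

    // Finally [2]: add whatever the finally block itself initialises.
    static Set<String> afterStatement(Set<String> atC, Set<String> finallyAssigns) {
        Set<String> result = new HashSet<>(atC);
        result.addAll(finallyAssigns);
        return result;
    }
}
```

If a is initialised on entry, and the try block and its single catch block each initialise x before completing normally, then x is not definitely initialised at (b), so the finally block cannot read it, but it is definitely initialised at (c), so the statements after the try-catch-finally statement can.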

But what happens after finally [1]? The answer is that it can go to many different places, depending on what caused us to enter the finally block in the first place. If someone broke out of some loops through a finally block, control would branch to after the outer loop. If someone returned, the function might return. On the other hand, there might be another finally block higher up in the function that we also have to process before returning. In general, execution can go to lots of different places. To implement this, the code generator will create an internal integer variable which will decide where to jump to after the finally block finishes.
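The selector idea can be imitated by hand in Java. This is only an illustration of the technique: the real code generator works at a lower level, and the constants and the run method here are made up. The try body records how it wants to exit, a single copy of the cleanup code runs, and a switch on the integer then sends control to the right place.

```java
import java.util.ArrayList;
import java.util.List;

class FinallyDispatch {
    // Hypothetical values for the internal selector variable.
    static final int FALL_THROUGH = 0;
    static final int BREAK_LOOP = 1;
    static final int RETURN = 2;

    static List<String> run(int[] inputs) {
        List<String> log = new ArrayList<>();
        int i = 0;
        loop:
        while (i < inputs.length) {
            int x = inputs[i++];
            int exit = FALL_THROUGH; // decides where to go after the cleanup

            // Lowered try body: instead of jumping away, record the destination.
            if (x < 0) {
                exit = BREAK_LOOP;
            } else if (x == 0) {
                exit = RETURN;
            } else {
                log.add("body " + x);
            }

            // The single copy of the finally block.
            log.add("cleanup " + x);

            // Dispatch on the selector once the finally block has finished.
            switch (exit) {
                case BREAK_LOOP:
                    break loop;
                case RETURN:
                    log.add("returned");
                    return log;
                default:
                    break; // fall through to the next loop iteration
            }
        }
        log.add("after loop");
        return log;
    }
}
```

With the inputs 1, -1, 2 the log is: body 1, cleanup 1, cleanup -1, after loop. With 1, 0, 2 it is: body 1, cleanup 1, cleanup 0, returned. In both cases the cleanup runs before control leaves.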

Anthony's Blog

My very own piece of write-only memory.
