Generics, Nullability, and Immutability

The last few weeks, I have been working on Generics. Last week, with some awkward grammar hacks, I implemented a parser for them that allows the ‘>>’ token to close two layers of type argument at a time. But this week I have come to a more difficult problem: how do nullability and immutability (the “type modifiers”) work with generics?

I knew that this would be difficult, so the first thing I considered was what problems I was solving by not just disallowing nullable/immutable type arguments. It quickly becomes obvious that doing this would be a bad idea, as it would prevent you from making a List<?Foo> or a Set<#Foo> (which would make generic collections inferior to arrays!).

Before I go into any more detail, I should explain how generics will work in Plinth. Other languages use several different approaches:

  • C++ makes each generic type a template, which is instantiated whenever it is needed. This means re-generating the low-level code for the generic type whenever it is used.
  • Java type-checks everything at compile time, and then erases all of the type information at runtime. This makes it impossible to do several things, like create generic arrays.
  • C♯ uses a system somewhat similar to Java’s, but keeps the run-time-type-information around, which makes it much more flexible.

The C++ approach makes generic nullability and immutability much easier for the compiler, because before it generates code for a generic function, it already knows exactly what type it is working on. The main problem with this approach is that it requires code to be generated at the point where the type is used. This means that any changes made to a generic type will not take effect until all uses of it are recompiled. This type of system would not work for Plinth, because it makes much stricter guarantees about binary compatibility.

Plinth will use a system very similar to C♯’s, but with the type system expanded to support nullability and immutability. One very minor difference is how Plinth will treat value types: it will implicitly cast them all to object, making them pointer types rather than value types. This is necessary so that Plinth only has to generate code for each generic type/method once.

Nullability

The difficult thing about mixing generics with nullability is that in general you can’t know whether or not a type argument is nullable. The following example should illustrate this:

class Foo<T>
{
  T field;
  void foo()
  {
    object o = field; // error: T might be nullable
    field = null; // error: T might not be nullable
  }
}

Both of the errors would actually become a problem when we provide certain type arguments. For example, the first error is set off when we make a Foo<?object>, and the second is set off by Foo<object>.

The compiler represents this type internally as a “possibly-nullable” type, which exists in between nullable and not-nullable. The type hierarchy says that not-nullable is a subtype of possibly-nullable, and possibly-nullable is a subtype of nullable. This allows you to, for example, convert from a T to a ?object, or to convert from a not-null subtype of T to a T.

This subtype relationship gives an obvious way of specifying to the compiler that you don’t want a type to be nullable: say it extends a (not-nullable) object. Unfortunately, there is no easy way to say that a type parameter should always be nullable, but if this is ever necessary it can be worked around by always using ?T instead of T. For example:

class Bar<T extends object, S>
{
  T notNullableField;
  ?S alwaysNullableField;
}

The full inheritance hierarchy for nullability is:

         nullable
            ^
            |
possibly-nullable
            ^
            |
     not-nullable

A <-- B means A is a super-type of B

Immutability

At first glance, immutability looks very similar to nullability where generics are concerned. It has a very similar subtype relationship: not-immutable is a subtype of possibly-immutable, and possibly-immutable is a subtype of immutable. Immutability just has the added restriction that you can’t downcast from immutable to possibly-immutable or from possibly-immutable to not-immutable.

This is extremely close to a complete model, and almost works just as well as the one for nullability. It is only a tiny edge case which makes it completely different in practise.

An edge case

Inside an immutable function, any fields you access (such as a property’s backing variable) are implicitly considered both final and immutable. So normally, from a method, you wouldn’t be able to return one of your object’s fields without the result type being ‘#Foo’ instead of just ‘Foo’. Also, by default, a property’s getter is an immutable function.

With just these rules, a getter would either be unable to return a non-immutable object, or it could be marked as ‘mutable’, which allows it to return a non-immutable object but stops it from being called from inside an immutable function.

Luckily, there is an extra rule: when you access a property (as with a field), the resulting value is implicitly immutable if the object you are accessing it on is immutable.

This extra rule doesn’t change anything about what you can do with immutable values, but it does allow us to add a special case to property getters. If a property getter is an immutable function, and the property is of some type ‘Foo’, and the getter tries to return a field of type ‘Foo’ (which is considered implicitly immutable because the getter is an immutable function), then the getter is allowed to return the value even though it is immutable.

To model this type of scenario, the compiler has a concept of ‘contextual immutability’. This is an extension to the type system which cannot be represented explicitly in code, but represents what happens if something gets made immutable purely because you are accessing it in an immutable context.

Generic Immutability

On top of contextual immutability, generics adds the concept of a possibly-immutable value. What happens if we combine them?

Say we tried to access a field with a possibly-immutable type in an immutable context. If the type argument turns out to be immutable, then the value should be immutable, but if the type argument turns out to be not-immutable, the result should be contextually immutable. This situation could be called something like ‘immutable-or-contextually-immutable’. Obviously, this makes the type system much more complicated, but it is required in order to allow properties to deal with generic types properly.

The full inheritance hierarchy for immutability is:

                     immutable
                      ^     ^
                     /       \
                    /         \
                   /        immutable-or-contextually-immutable
                  /            ^
contextually-immutable         |
                 ^             |
                  \         possibly-immutable
                   \           ^
                    \         /
                     \       /
                   not-immutable

A <-- B means A is a super-type of B

Of course, the only ones that you can explicitly specify are ‘immutable’ and ‘not-immutable’.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.