Hiding Methods

This week, I was trying to rewrite the inheritance checker for generics. The inheritance checker is the compiler pass which makes sure that there aren’t any problems with overridden methods, and that only abstract classes can have abstract methods.

The question I was faced with was: Can something inherit from the same generic interface twice, with different type arguments? For example:

interface Processor<T>
{
  void process(T t);
}
class MultiProcessor implements Processor<int>, Processor<string>
{
  void process(int i) {}
  void process(string s) {}
}

This makes conceptual sense, and doesn’t have any ambiguities, so it really should be allowed. Unfortunately, there are two problems that can arise if this is allowed. The first is with the low-level implementation of method overriding, which can be changed; but the second is an ambiguity that can’t be solved without substantial changes to the language. Here’s an example that should illustrate the second problem:

interface List<T>
{
  void add(T t);
  T get(uint index);
  uint size();
}
class MultiList implements List<int>, List<string>
{
  void add(int i) { ... }
  void add(string s) { ... }
  int get(uint index) { ... }
  string get(uint index) { ... }
  // what about size()?
}

MultiList tries to be two different lists at the same time. The add() and get() methods should work fine, because they have different signatures (in plinth, the return type is part of a method’s signature). However, the size() method has the same signature no matter which list it is from. Obviously we don’t want both lists to have the same implementation of size() – we might have 2 ints and 7 strings. In order for everything to still work when we cast from MultiList to List<int> or List<string>, we need to have two different implementations.

But with the current system, if you override a method, it overrides all of the super-type methods with the same signature, making it impossible to have two different size() implementations. The solution is to change the language to allow methods to be hidden instead of overridden. For the unfamiliar, here’s a quick description of the difference:

  • Hiding is what happens to fields: when you make a new one with the same name in a subtype, the one in the supertype still exists, but you can’t access it from the subtype (if you need to access it, you can cast to the supertype first and access it there).
  • Overriding is what happens to methods: when you make a new one with the same signature in a subtype, the one in the supertype gets overwritten with your new one, so that everywhere you access it you are referring to the same thing.
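As a rough illustration of the two behaviours, here is a small sketch in Java rather than plinth, since Java also hides fields and overrides methods. The class names Base, Derived and HidingVsOverriding are made up for this example.

```java
// Sketch only: Base, Derived and HidingVsOverriding are hypothetical names.
class Base {
    String name = "base-field";                  // will be hidden by Derived.name
    String describe() { return "base-method"; }  // will be overridden by Derived.describe()
}

class Derived extends Base {
    String name = "derived-field";               // a second, separate field with the same name

    @Override
    String describe() { return "derived-method"; }
}

class HidingVsOverriding {
    public static void main(String[] args) {
        Derived d = new Derived();
        Base asBase = d; // the same object, viewed through the supertype

        // Field access is resolved from the static type,
        // so going through the supertype reveals the hidden field.
        System.out.println(d.name);      // prints "derived-field"
        System.out.println(asBase.name); // prints "base-field"

        // Method calls are dispatched on the runtime type,
        // so both calls reach Derived.describe().
        System.out.println(d.describe());      // prints "derived-method"
        System.out.println(asBase.describe()); // prints "derived-method"
    }
}
```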

What we actually need to do to solve this sort of problem is to hide a method while providing an implementation for it, i.e. override it and hide it at the same time. We could provide implementations for both size() methods to do different things, and then hide one (or both) of them, so that you get two different answers when you do:

(cast<List<int>> multi).size()
(cast<List<string>> multi).size()

When you do multi.size(), you get whichever of them is not hidden, or a compiler error if both of them are hidden.
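To make the intended behaviour concrete, here is a rough Java sketch. Java cannot implement the same generic interface twice, so the two casts are emulated with view objects; this only models the result of the casts, not how plinth would implement hiding. SimpleList, MultiListSketch, asIntList() and asStringList() are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the post's List<T> interface.
interface SimpleList<T> {
    void add(T t);
    T get(int index);
    int size();
}

class MultiListSketch {
    private final List<Integer> ints = new ArrayList<>();
    private final List<String> strings = new ArrayList<>();

    // Emulates (cast<List<int>> multi): size() counts only the ints.
    SimpleList<Integer> asIntList() {
        return new SimpleList<Integer>() {
            @Override public void add(Integer i) { ints.add(i); }
            @Override public Integer get(int index) { return ints.get(index); }
            @Override public int size() { return ints.size(); }
        };
    }

    // Emulates (cast<List<string>> multi): size() counts only the strings.
    SimpleList<String> asStringList() {
        return new SimpleList<String>() {
            @Override public void add(String s) { strings.add(s); }
            @Override public String get(int index) { return strings.get(index); }
            @Override public int size() { return strings.size(); }
        };
    }

    public static void main(String[] args) {
        MultiListSketch multi = new MultiListSketch();
        multi.asIntList().add(1);
        multi.asIntList().add(2);
        for (int i = 0; i < 7; i++) {
            multi.asStringList().add("s" + i);
        }
        System.out.println(multi.asIntList().size());    // prints 2
        System.out.println(multi.asStringList().size()); // prints 7
    }
}
```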

The syntax I am considering for this is similar to C♯’s “explicit implementation” syntax for interfaces, but more flexible:

class MultiList implements List<int>, List<string>
{
  // ... add() and get() methods ...
  hiding uint List<int>::size() { ... } // overrides List<int>::size()
  uint size() { ... } // overrides List<string>::size()
}

This gives us implementations for each of the size() methods separately, and hides List<int>::size() so that only List<string>::size() is accessible from MultiList. If we want to call List<int>::size(), we must first cast the MultiList to a List<int>.

This syntax will also allow you to hide a method and provide a new method with the same signature, without overriding it. For example:

class Bar
{
  uint parse(string s) { ... }
}
class Foo extends Bar
{
  hiding uint Bar::parse(string s); // hides Bar::parse() without overriding it

  uint parse(string s) { ... } // a new method, unrelated to Bar::parse()
}

So, in order to hide a super-type’s method, we declare it with the ‘hiding’ modifier, and instead of just giving its name, we use SuperType::name.

Note: I am not completely decided on the keyword ‘hiding’. While ‘hidden’ might make more sense, I do not expect this feature to be used very often, so I don’t want to use a keyword which people often use as a variable name.

Low-Level Hiding

The other problem I mentioned was with the low-level implementation of overriding. Currently, each method has a “disambiguator”, which is a string representation of its signature, including its name and its parameter and return types. Methods are only overridden by the plinth runtime if their disambiguator strings are equal.

One problem with this way of doing things is that it doesn’t cope with inheriting members from the same super-type twice. To override both List<int>::get() and List<string>::get(), you would need to provide two methods which have different disambiguators, but the super-type only has one disambiguator for List<T>::get(), so they cannot override it properly with the current system.
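The mismatch can be sketched with plain strings. The post does not show plinth's actual disambiguator format, so the "name(params)returnType" encoding below is an assumption, as is writing the super-type's entry in terms of T. The Java class and method names are hypothetical too.

```java
// Sketch only: the string format is assumed, not plinth's real encoding.
class DisambiguatorSketch {
    // Builds a signature string from the method name, return type and parameter types.
    static String disambiguator(String name, String returnType, String... paramTypes) {
        return name + "(" + String.join(",", paramTypes) + ")" + returnType;
    }

    public static void main(String[] args) {
        // The single entry the super-type List<T> has for get().
        String declared = disambiguator("get", "T", "uint");

        // The two methods MultiList would need to provide.
        String forInt = disambiguator("get", "int", "uint");
        String forString = disambiguator("get", "string", "uint");

        // Overriding by string equality: one super-type string cannot
        // equal two different sub-type strings at the same time.
        System.out.println(declared + " == " + forInt + " ? " + declared.equals(forInt));
        System.out.println(declared + " == " + forString + " ? " + declared.equals(forString));
        System.out.println(forInt + " == " + forString + " ? " + forInt.equals(forString));
    }
}
```

Whatever single string the super-type uses, it can match at most one of the two distinct strings, which is why equality matching alone cannot connect both overrides.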

Another problem is that a method cannot opt out of overriding a superclass method with the same signature, unless it picks a deliberately non-matching disambiguator (which would be bad for binary compatibility).

To solve these problems, the way the runtime generates VFTs (virtual function tables) will have to be dramatically changed in a way I haven’t fully thought out yet. Luckily, this part of the runtime is only run once when the program is starting up, so the minor efficiency tradeoff that this will probably necessitate is almost certainly worth it.

For now, I’ll get back to implementing generics without support for inheriting from the same type multiple times, and support for these features will be added later on.

7 thoughts on “Hiding Methods”

  1. Adam Johnson

    Why should we be able to access the size() of both? It’s encouraging bad coding if you can implement two interfaces that intersect on their names of things that you can still implement and access both…

    If I’m looking at a multilist being passed around in some piece of code, I’ll need to know too much about its implementation to even understand that both:

    (cast<List<int>> multi).size()
    (cast<List<string>> multi).size()

    should be expected to be different.

    “Obviously we don’t want both lists to have the same implementation of size() – we might have 2 ints and 7 strings.” – really? Shouldn’t my MultiList just have a single size – of 9 items?

    I don’t even get why you need it for fields. Say I subclass SomeGraphicsLibrary’s Point to ‘HighPrecisionPoint’ to modify the X and Y fields to doubles – I can’t do that if I’m actually hiding the fields on the superclass, and I end up having to reimplement everything, no?

  2. Anthony

    Wow, thanks for asking. When I wrote this, I wasn’t even considering what use-cases there might be for method hiding; I’d just assumed that because it makes the language more flexible, eventually someone would find it useful. Now that I think about it, I’m not sure that it’s justified. I’ve just found a blog post by Eric Lippert on why method hiding is allowed in C♯, but I’m not sure that the reasoning that he gives is enough to justify it, especially in a language that has the return type of a method as part of the method’s signature.

    In the case that I’ve described, it’s impossible to inherit from two Lists correctly without method hiding, since you’d be breaking the contract of List<int> if you returned 9 from size() but then couldn’t get() 7 of the elements of the list because they’re strings. However, you could just as easily have each list as a separate field on the class, and then access those fields instead of casting the MultiList, and I would argue that it probably makes more sense to do it that way.

    So thanks for saving me from spending a lot of time implementing an almost-useless feature.

    One thing that I will still be doing is allowing you to inherit from the same interface multiple times, in case you want to have something which is both an EventHandler<Foo> and an EventHandler<Bar>. I’ll still have to fix the low-level representation to allow for that, but it won’t have to be as drastic as I imagined.

  3. Adam Johnson

    No worries! Haha. That’s a very interesting blog post, hadn’t considered the idea of virtual/non-virtual dispatch before.

    Of course, I’m not a fan of the two-ways-to-do-things approach. The fewer options there are, the slimmer the language, and thus the smaller the barrier to entry but also the more universal the code. If you made it so Plinth only compiled 4-spaces tabbed code and was very particular about where braces and indentation were, you’d save a whole lot of effort down the line 😉

    Sorry about the last paragraph on my post though, I just realized whilst cooking it’s a confusing piece of drivel from someone who hasn’t used a strongly-typed language in a couple of years. Forget I said it!

  4. Anthony

    Now that you mention it, I’m not really a fan of the two-ways-to-do-things approach either. I’ve actually done some of that already: braces are required on if, for, and while statements, and tabs are explicitly disallowed, because of the tendency for code to end up horribly indented if you/your editor mixes them with spaces.

    Your last paragraph does make sense from the point of view of the user of an object: casting and accessing a field of the same name shouldn’t really give you a different field – it’s confusing. Ideally, non-private fields would be replaced by properties (which are overridden, not hidden). While that wouldn’t let you change the apparent type, it would let you change the getter/setter’s implementation, and store the data a different way.

  5. Adam Johnson

    Ah very nice! Python’s tying whitespace into the language definition is good, but I do wish they had banned tab characters!

    And ah yes, that makes sense. How are properties in plinth?

    Also as a side note, I recently read “Javascript: the good parts” which is worth it for very heavy-handed opinions on Javascript which he makes look like one of the worst designed languages in history. Perhaps learning about its faults might help you when making decisions in the future?

  6. Anthony (post author)

    Properties are designed to be simple to use, so they have backing variables, getters, and setters by default. But they’re also flexible enough that they can get a bit complicated if you want to do certain things, like making them final or having custom setters (properties often have a constructor as well as a setter, which has to be called during object initialisation, so the initialisation and inheritance rules can be quite complicated).

    Thanks for the recommendation, I’ll have to check it out some time. I already know a few of the silly things about it, the weak type system comes to mind, as well as being able to create variables called ‘undefined’. My main aim with plinth is to design it sensibly and so that it’s intuitive to use (which gets quite difficult sometimes, see the post I’ve just put up), so knowing about other languages’ faults is always useful.
