{"id":218,"date":"2013-02-03T16:56:18","date_gmt":"2013-02-03T16:56:18","guid":{"rendered":"http:\/\/bryants.eu:7564\/blog\/?p=218"},"modified":"2013-02-03T16:59:53","modified_gmt":"2013-02-03T16:59:53","slug":"casting-and-instanceof","status":"publish","type":"post","link":"https:\/\/bryants.eu\/blog\/2013\/02\/casting-and-instanceof\/","title":{"rendered":"Casting and Instanceof"},"content":{"rendered":"<p>Last week, I implemented the <code>instanceof<\/code> operator. Doing this made me consider in much more detail how casting and instanceof should work, as there are several ways that they could.<\/p>\n<h2>The Unified Type System<\/h2>\n<p>A unified type system is one where every type can be cast to some all-encompassing super-type. In Plinth, the name of that type is &#8216;<code>object<\/code>&#8216;, or to be more specific &#8216;<code>?#object<\/code>&#8216; in order to account for nullability and immutability.<\/p>\n<p>When you cast, say, a <code>double<\/code> to an <code>object<\/code>, you are implicitly performing a heap allocation so that the <code>double<\/code> can be stored on the heap inside something that looks like an <code>object<\/code>. In fact, it is just an <code>object<\/code> with a <code>double<\/code> tacked onto the end, and with its virtual functions replaced by ones which extract that <code>double<\/code> and call the real functions. The same system is used to cast tuples, compounds, functions, and all of the other primitives to objects.<\/p>\n<p>On the other hand, arrays and classes already look enough like objects to be cast to them directly (using the same pointer).<\/p>\n<p>Because we might want to use <code>instanceof<\/code> on an <code>object<\/code>, or check whether to throw a <code>CastError<\/code> before we actually do a cast, every object stores a pointer to a block of run-time type information (RTTI). This RTTI holds data about the sort of type (e.g. 
&#8216;primitive&#8217; or &#8216;array&#8217;), and the properties of the individual type (e.g. the kind of primitive it is, or an array&#8217;s base type). When we cast a <code>double<\/code> to an <code>object<\/code>, that <code>object<\/code> has an RTTI block for <code>double<\/code>.<\/p>\n<h2>Casting<\/h2>\n<p>Different casts are performed in very different ways. A cast from <code>uint<\/code> to <code>long<\/code> simply zero-extends the binary value from 32 bits to 64 bits, whereas a cast from a class <code>Foo<\/code> to <code>object<\/code> is just a matter of reinterpreting the same pointer as a different type.<\/p>\n<p>Casting from an <code>object<\/code> to some other type is much more complicated. If the RTTI for the object does not match the destination type properly, we need to throw a <code>CastError<\/code>. If it does match, we might have to extract a value from inside the object, or maybe reinterpret that object as a class, or possibly search through the RTTI for a super-interface&#8217;s virtual function table (VFT) and tuple it with the object pointer.<\/p>\n<p>But what happens if we do the following?<\/p>\n<pre>long a = 5;\r\nobject obj = a;\r\nint b = cast&lt;int&gt; obj;<\/pre>\n<p>The <code>object<\/code> is a <code>long<\/code>, but <code>long<\/code>s can be cast to <code>int<\/code>s easily &#8211; the cast is just a truncation from 64 bits to 32 bits.<\/p>\n<p>The problem is knowing that <code>obj<\/code> is a <code>long<\/code>. It might be anything from a <code>boolean<\/code> to a <code>string<\/code>, in which case the result should definitely be a <code>CastError<\/code>. Since the RTTI for <code>long<\/code> doesn&#8217;t match the values we expect for an <code>int<\/code>, the cast won&#8217;t be allowed.<\/p>\n<p>However, with a lot of work, it would be possible to allow it. 
We could write a really long line of checks for whether the run-time-type is <code>int<\/code> or <code>ubyte<\/code> or <code>boolean<\/code> or <code>float<\/code> or <code>{uint -&gt; string}<\/code> or <code>[][]double<\/code>. We would actually only need to check the types that we could convert from, but it would require a huge amount of LLVM code for such a small amount of Plinth code.<\/p>\n<p>This type of &#8220;transitive&#8221; cast gets especially cumbersome when you consider the fact that you can convert tuples like <code>(ubyte, short, boolean)<\/code> to <code>(int, long, boolean)<\/code>. If this type of cast were allowed, casting from an object to a tuple would require looking through the RTTI for each of the tuple&#8217;s values (recursively) to first check whether the conversion is possible, and second find out how to perform it.<\/p>\n<p>In Plinth, transitive casts are not allowed. In order to handle the sort of cast we tried to do above, we would have to do something like:<\/p>\n<pre>int b = cast&lt;int&gt; cast&lt;long&gt; obj;<\/pre>\n<h2>Instanceof<\/h2>\n<p>The instanceof operator (which might be renamed to &#8216;<code>is<\/code>&#8216; at some point) allows you to check whether a value is an instance of a given type. For example:<\/p>\n<pre>long a = 5;\r\nobject obj = a;\r\nboolean isInt = obj instanceof int; \/\/ false\r\nboolean isFoo = obj instanceof ?#Foo; \/\/ type error: cannot check against a nullable or immutable type<\/pre>\n<p>As shown, you cannot check whether something is nullable or data-immutable. This is because these are properties of the reference, not the value itself: even if the reference is immutable,\u00a0the value behind the reference usually won&#8217;t be. Similarly, the value behind a nullable field won&#8217;t itself be nullable &#8211; it can&#8217;t be null. 
The exception to this is when checking nested types: <code>[]boolean<\/code> is different from <code>[]?boolean<\/code>.<\/p>\n<p>There are several different ways in which <code>instanceof<\/code> could have worked. For example, it could return true whenever the value is assignable to the type without losing data, so a <code>uint<\/code> with value <code>3<\/code> would be an instance of <code>ubyte<\/code>, because it fits inside an 8-bit unsigned value. This scheme would have required the same sort of transitive checking described above.<\/p>\n<p>Another way would have been to return true whenever the assignment would work in a single step without losing data (i.e. the same, but without the transitive checks), which would illustrate exactly when casting would work. This could give you true for <code>(ubyte, Foo, string) instanceof (uint, Bar, string)<\/code>. It could even allow nullable values, returning true if a null value were checked against a nullable type. The problem is that this system is very often confusing and can&#8217;t tell you unambiguously whether this number you&#8217;ve got is a <code>ubyte<\/code>.<\/p>\n<p>Plinth uses the simplest and hopefully the most obvious system: <code>value instanceof type<\/code> is true only when the value is already an instance of type. For classes and interfaces, this means checking against all super-types as well, because those super-types are part of the sub-type. But checking a <code>short<\/code> against <code>int<\/code> will result in false, even though the cast is known at compile time to be just a sign-extension to 32 bits.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Last week, I implemented the instanceof operator. Doing this made me consider in much more detail how casting and instanceof should work, as there are several ways that they could. The Unified Type System A unified type system is one where every type can be cast to some all-encompassing super-type. 
In Plinth, the name of [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"_links":{"self":[{"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/posts\/218"}],"collection":[{"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/comments?post=218"}],"version-history":[{"count":8,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/posts\/218\/revisions"}],"predecessor-version":[{"id":225,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/posts\/218\/revisions\/225"}],"wp:attachment":[{"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/media?parent=218"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/categories?post=218"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/bryants.eu\/blog\/wp-json\/wp\/v2\/tags?post=218"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}