Author Archives: Anthony

Cosmic Signpost – Tracking More Stuff

My post on Moon Tracking covers most of the complexities involved in tracking things, but not quite everything. I chose to talk about the Moon first because it covered all of the most complex parts of the tracker. This post covers all of the rest of the things that the Cosmic Signpost can currently track.

Planets

Tracking planets in our solar system is only a small addition to the complexity of tracking the Moon.

JPL provides some approximate orbital parameters for tracking planets. They’re fairly simple models, so they won’t be particularly accurate, but they’re very simple to implement, and given the distance from Earth the inaccuracies won’t be very noticeable.

However, they are all based on a heliocentric inertial reference frame. We have our ECI frame though, and the axes are in the same directions, so we should be able to subtract out the Earth’s position vector to get to that frame, right?

Unfortunately, the orbital parameters that JPL gives for Earth are actually for the Earth-Moon Barycentre. The Barycentre of a set of objects is the centre of mass of all of them together, and in the case of the Earth-Moon system, the Barycentre is about 4,671 km from the centre of the Earth, or most of the way from the centre of the Earth to the crust. Since we already know how to track the Moon, we can track the Earth-Moon Barycentre by multiplying the Moon’s position vector (from Earth) by the ratio of the Moon’s mass to the total mass of the Earth-Moon system, which is 7.349×10^22 / (5.972×10^24 + 7.349×10^22) = 0.0121561693.

If we take the position vector from Earth to the Barycentre (which we can calculate from our Moon tracking as above), and subtract the position vector from the Sun to the Barycentre (which we can calculate based on JPL’s approximate planet positions), this will give us the Sun’s position in ECI. We can then add any heliocentric coordinates we like, and they will be in ECI, so now we can track the Sun and all of the Planets.
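To make the arithmetic concrete, here’s a minimal Python sketch of that subtraction (the Cosmic Signpost itself is written in C++; the two input vectors are assumed to come from the Moon tracking and from JPL’s approximate planet parameters respectively):

import numpy as np

MOON_MASS = 7.349e22   # kg
EARTH_MASS = 5.972e24  # kg
# Fraction of the Earth-Moon distance from the Earth's centre to the barycentre.
BARYCENTRE_FACTOR = MOON_MASS / (EARTH_MASS + MOON_MASS)  # ~0.0121561693

def sun_position_eci(moon_from_earth, barycentre_from_sun):
    """Returns the Sun's position in ECI, in metres.

    moon_from_earth: the Moon's ECI position relative to Earth.
    barycentre_from_sun: the Earth-Moon Barycentre's position relative to
        the Sun, from JPL's approximate planet parameters.
    """
    # The barycentre lies along the Earth-to-Moon line, a fixed fraction out.
    barycentre_from_earth = BARYCENTRE_FACTOR * np.asarray(moon_from_earth)
    # Earth-to-Sun = Earth-to-Barycentre minus Sun-to-Barycentre.
    return barycentre_from_earth - np.asarray(barycentre_from_sun)

Adding any planet’s heliocentric position vector to this result then gives that planet’s position in ECI.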

Distant stars

Distant stars (and galaxies, and whatever else is normally fixed in relation to the stars) are tracked using a Right Ascension and Declination. These are basically polar coordinates like latitudes and longitudes, but projected into the night sky. (In this analogy, the right ascension is the longitude, and the declination is the latitude).

We do know the distance of some stars, but it’s not always well documented and they’re so far away anyway that it shouldn’t really make a difference for tracking, so the simple (but slightly hacky) way to turn them into cartesian locations is to choose a distance that’s really far away, and make them all that distance from the Earth. I chose 10^100 m (which is ~10^84 light-years) for my range, because it’s a nice middle ground between being excessively far away and not overflowing a double-precision floating point number during calculations (they have a maximum value of 1.79×10^308).
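In Python terms, the conversion from Right Ascension and Declination to cartesian coordinates is just spherical-to-cartesian with an enormous radius. A hedged sketch (the real code is C++):

import numpy as np

FAR_AWAY_METRES = 1e100  # arbitrary "effectively infinite" distance

def star_to_cartesian(right_ascension_deg, declination_deg):
    """Projects an RA/Dec pair onto a sphere of enormous radius."""
    ra = np.radians(right_ascension_deg)
    dec = np.radians(declination_deg)
    return FAR_AWAY_METRES * np.array([
        np.cos(dec) * np.cos(ra),  # X: towards RA 0 on the equatorial plane
        np.cos(dec) * np.sin(ra),  # Y: 90 degrees east of X
        np.sin(dec),               # Z: towards the north celestial pole
    ])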

These coordinates use the equatorial version of ECI discussed in the Moon Tracking post, so converting them to fixed coordinates just involves rotating the Earth as specified there. The Cosmic Signpost menu has a list of interesting stars hardcoded into it, but you can also enter a Right Ascension and Declination manually and track that.

GPS Coordinates

These are the simplest to track, since we already have the code to convert from Latitude and Longitude to ECEF and we don’t even need to rotate the Earth. There’s not much to say here except that I’ve hardcoded a list of cities and other places that the Cosmic Signpost can track, and you can also enter a location manually using Latitude and Longitude coordinates (just like you can using Right Ascension and Declination for distant stars).

Earth’s Satellites

We’ve launched thousands of satellites into orbit around Earth over the last 70ish years, and there are standardised ways to track them that are quite accurate. The most well-known algorithm for this is called SGP4, and it factors in lots of things like the changing gravity from Earth, the Sun, and the Moon as well as the pressure of radiation coming from the Sun. It’s a very complicated algorithm that can be quite difficult to understand.

NORAD (which does most of this tracking) shares its data about the current orbits of satellites with CelesTrak.org, a non-profit that makes the data available publicly. CelesTrak has an API that can produce this data in multiple formats. The older format is the TLE (Two-Line Element Set), but it is annoying to parse and not future-proof (dates all have two-digit years, and the number of satellites that can be tracked is limited). The newer format is the Orbital Mean-Elements Message (OMM), which can be encoded in JSON, XML, or whatever is most convenient. CelesTrak provides an API for producing OMM messages that we can query by NORAD catalog number to look up any satellite that they provide data for. Here’s the International Space Station’s tracking information in JSON.

It’s important to note that the tracking data that CelesTrak provides is only useful if you use the SGP4 orbital propagation algorithm to find where a satellite is at a particular time. The orbital elements they provide won’t be accurate with any other algorithm. (There is also SDP4, a variation on SGP4 for satellites that are deeper in space; I’m referring to both algorithms as SGP4 here because the reference implementation switches between the two automatically.)

The most well-known implementation of SGP4 is by David Vallado, who wrote it as a companion to his book “Fundamentals of Astrodynamics and Applications”. His code is available via CelesTrak here.

Unfortunately, to a software engineer who is not an astronomer, the code is very difficult to read – most variables and functions have incomprehensible names, some of those variables are never actually used, and tens of variables at a time are often passed to sub-functions either by value or by reference without any explanation of why they’re passed that way. The parsing of TLEs is very coupled to the tracking, and there is no parser for OMM messages.

Partly because of some of these shortcomings, and partly in an effort to understand it better, I adapted this code to create my own version of the SGP4 propagator here (although I wouldn’t call it much of an improvement, because I still don’t fully understand the algorithm myself): https://github.com/abryant/CosmicSignpost/blob/main/lib/tracking/sgp4_propagator.cc

By giving the propagator a set of orbital elements and a time, we can get the current position of any trackable satellite in an equatorial ECI frame. The original aim of the Cosmic Signpost was to track the International Space Station, so by giving it NORAD Catalog Number 25544 we’ve accomplished that.
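If you want to experiment without porting SGP4 yourself, the sgp4 package on PyPI (which wraps Vallado’s implementation) does the same propagation. A minimal sketch – the TLE lines below are truncated placeholders, to be replaced with fresh data from CelesTrak:

from sgp4.api import Satrec, jday  # pip install sgp4

# Placeholder TLE for the ISS (NORAD 25544) -- fetch current lines from
# CelesTrak before running; these truncated strings will not parse as-is.
line1 = "1 25544U 98067A   ..."
line2 = "2 25544  51.6400 ..."

satellite = Satrec.twoline2rv(line1, line2)
jd, fr = jday(2024, 1, 1, 12, 0, 0.0)  # Julian date split into day + fraction
error, position, velocity = satellite.sgp4(jd, fr)
# error is 0 on success; position is (x, y, z) in km in an equatorial ECI frame.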

Things it can’t track (yet)

There are a few things that it would be possible to track in theory, but that I haven’t implemented in the Cosmic Signpost yet:

  • Asteroids
  • Comets
  • Moons of other planets (it’s easier to just track the planet itself)
  • Objects that are flying through our solar system without orbiting
  • Aeroplanes
  • Boats
  • GPS trackers that share their location

I’d be happy to accept pull requests for these, if anyone feels like implementing them.

Cosmic Signpost – Moon Tracking

You need to do a lot of maths to point at the Moon. You might think it’s a simple orbit and there should be a simple way to calculate the Moon’s position based on the current time, and you’d be generally right except for the word “simple”. Let’s start by breaking this down into stages:

  1. Find the Moon’s orbital parameters
  2. Calculate its position in cartesian (x, y, z) coordinates
  3. Transform those coordinates according to the Earth’s rotation
  4. Find your own location in cartesian coordinates
  5. Calculate the direction to point in

But before we can do any of that we need to decide on some reference frames.

I’m going to include code links to the Cosmic Signpost in each section.

Reference Frames

Code link: reference_frame.h

In astronomy there are several different reference frames you can use. The most common are:

  • International Celestial Reference Frame (ICRF): the celestial reference frame, centred at the barycentre of our solar system and fixed with respect to several distant star systems.
  • International Terrestrial Reference Frame (ITRF): the terrestrial reference frame, which is centred at the centre of the Earth and rotates with it.

Tracking things in space from the surface of Earth often requires converting between these two frames. However, a frame centred on the Sun like ICRF isn’t necessary for tracking the Moon – for that we need an Earth Centred Inertial (ECI) frame (which is centred on Earth but doesn’t rotate with it) in addition to ITRF.

First, to convert between ITRF and ECI we need to define our axes. We’ll be using a cartesian coordinate system called Earth Centred, Earth Fixed (ECEF) for ITRF, and something very similar but non-rotating for ECI. In ECEF, (0, 0, 0) is at the centre of the Earth, the Z axis points towards the north pole, the X axis points towards the point where the prime meridian touches the equator (GPS 0, 0), and the Y axis points 90 degrees east of that.

There are several steps to the conversion between ITRF and ECI, but because ITRF is rotating and ECI isn’t, we need to know the precise time whenever we convert between them. There’s a reference time called J2000, and we know the exact angles between the two frames at that exact time. From there we can use the speed of the Earth’s rotation (and other things) to figure out what the angles are at any particular time.

Unfortunately, the definition of our reference time J2000 is already quite confusing: it was originally defined as midday (not midnight, because astronomers observe at night) on the 1st of January in the year 2000. However, this was defined in Terrestrial Time, which (a) is an ideal that can’t be measured by real clocks, and (b) was ahead of UTC by 64.184 seconds at that moment (Terrestrial Time runs exactly 32.184 seconds ahead of atomic time, which was itself 32 leap seconds ahead of UTC), so the actual value of J2000 is 11:58:55.816 UTC on the 1st of January, 2000. It’s worth noting that at the time of writing we have had 5 leap seconds since J2000 – not accounting for those could throw the calculations off, as leap seconds are designed to keep UTC in sync with the Earth’s rotation.

Next, let’s clarify which ECI we mean, because so far we’ve only said it’s an inertial frame centred at the centre of the Earth, not where the axes are. We have two planes we could use: the equatorial plane (which forms Earth’s equator) in the orientation that it was in at the moment of J2000, and the ecliptic plane (which the Earth orbits the Sun in); the two differ by around 23.4° because of Earth’s axial tilt. Since the orbits of all of the planets and the Moon are usually defined in terms of the ecliptic plane, we’ll use that. We’ve decided that the Z axis points up from the ecliptic plane, and we can say that the Y axis is 90 degrees east of the X axis, but we still need to define the X axis. We could do this based on where the prime meridian was pointing at J2000, but the normal definition is actually based on where the mean vernal equinox was at J2000.

The vernal equinox happens in March every year, at the moment when the line that forms the intersection between the ecliptic and equatorial planes points directly towards the Sun. You’ll notice that this is in March, while J2000 is in January – the X axis is based on where the equinox line (i.e. that intersection of the two planes) is in January. The equatorial plane slightly changes over time because of slight variations in the Earth’s axis, which means the direction of the equinox line is constantly, but slowly, changing. Notice that since our X axis is along this line at J2000, we can perform a simple rotation around the X axis to compensate for the Earth’s axial tilt and convert between this ecliptic ECI frame (where the Z axis points upwards from the ecliptic plane) and an equatorial ECI frame (where the Z axis points towards the north pole).

The equinox line (blue, defining the X axis) at a random point in the Earth’s orbit around the Sun. The date in this diagram doesn’t coincide with the date of the equinox (when days and nights have an equal length), because the line is not currently pointing at the Sun.

Now that we have some reference frames defined, we can start to use them.

Finding the Moon’s Orbital Parameters

Code link: moon_orbit.cc

You might have heard that the Moon is very slowly drifting away from Earth at the rate of a few centimetres per year. This is only one of the ways in which the parameters of its orbit are changing: they are constantly affected by its orbit around the Earth (which moves it closer to and further away from the Sun), by the Earth’s orbit around the Sun, and potentially by other things too.

The Keplerian Orbital Parameters are six numbers that can exactly define an elliptical orbit of one body around another. They are:

  • Semi-major axis: the radius of the orbit at its widest point
  • Eccentricity: the scale of how elliptical the orbital ellipse is (0 meaning circular, 1 and over meaning that it’s not orbiting any more)
  • Inclination: the angle of the orbital plane with respect to some reference frame (in our case, the ecliptic ECI frame)
  • Longitude of Ascending Node: the angle around the thing we’re orbiting (from some reference direction) where we measure the inclination from. Specifically, this is the angle where the Moon moves from below the Ecliptic plane to above it.
  • Argument of Periapsis: the angle around the orbit from the Longitude of Ascending Node where the two bodies are closest to each other.
  • Mean Anomaly: the current angle of the smaller body around the larger one, if you assume that the orbit is circular and the smaller body is moving at a constant speed around the larger one. (This is useful because it generally increases uniformly over time, and it can be used to calculate other parameters called the Eccentric Anomaly and True Anomaly, which are actual angles around the orbit).

For a much more in-depth and interactive explanation of how these 6 parameters work together see this page: https://ciechanow.ski/gps/#orbits – it’s geared towards explaining in detail how GPS works, but as part of that it goes into detail on all of the orbital parameters.

For the Moon, all six of these parameters vary over time. The best data source I could find for this was the JPL Horizons tool, which allows you to choose any body in the solar system and get a table that includes more than just these parameters for any range of dates/times.

To begin with, I downloaded all of the Moon’s orbital parameters at a daily resolution for 16384 days starting at midnight on the 1st of January 2000 (unfortunately, due to the way the data is exported, all of the dates were offset by -0.5 days from J2000, but this is correctable).

To analyse this large dataset, I wrote some code in Python to display graphs and do some numeric analysis. The first part of this was to perform a Fast Fourier Transform (FFT) to see which frequencies each component oscillates at. I manually gathered the top 8 frequency components from the graph of each FFT, since the aim is to find a good approximation rather than storing all of this data. For the semi-major axis, eccentricity, and inclination, just an FFT was enough. But the longitude of ascending node, argument of periapsis, and mean anomaly all have linear components too (i.e. the angle has a cumulative increase/decrease as well as oscillations), and their angles always wrap around when they get to 360°. To correct for this, I used numpy’s unwrap() function with the period set to 360, which turns a wrapped angle time series into a continuous one.

And because this continuous series still has a linear component, I first approximated the linear gradient and subtracted it out before performing the FFT. From there, all six graphs look similar, although each has different oscillations.
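Here’s a small sketch of that preprocessing in Python, on synthetic stand-in data (the real analysis ran on the JPL Horizons exports):

import numpy as np

t = np.arange(16384)  # time in days from J2000
# Synthetic stand-in for e.g. the mean anomaly: linear growth plus one
# oscillation, wrapped into [0, 360) the way the exported data is.
angles_deg = (13.06 * t + 6.0 * np.sin(2 * np.pi * t / 27.55)) % 360

unwrapped = np.unwrap(angles_deg, period=360)   # undo the 360-degree wraps
slope, intercept = np.polyfit(t, unwrapped, 1)  # approximate the linear part
residual = unwrapped - (slope * t + intercept)  # leaves just the oscillations

spectrum = np.abs(np.fft.rfft(residual))
freqs = np.fft.rfftfreq(len(residual), d=1.0)   # cycles per day
# The largest peaks in `spectrum` give the initial frequency guesses.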

Finding the initial frequencies was as simple as looking them up in the FFT graphs.

Next, I used a function similar to the following to define the curve-fitting process:

$$a_t = \sum_{i=1}^{8} (S_i sin(2 \pi F_i t) + C_i cos(2 \pi F_i t))$$

Where $a_t$ is the semi-major axis at time $t$, $F_i$ represents one of the eight frequency components (I chose 8 as an arbitrary trade-off between accuracy and complexity), and $S_i$ and $C_i$ are sine and cosine amplitudes. The parameters to optimise are $F_i$, $S_i$, and $C_i$. Using sine and cosine amplitudes is equivalent to using a magnitude and a phase, but it gives each of the parameters a differentiable effect on the output, which makes it easier to optimise than a phase angle. It’s possible to turn the sine and cosine amplitudes into magnitude and phase using the following transformation (example in desmos):

$$magnitude = \sqrt{S^2 + C^2}$$

$$phase = atan2(C, S)$$

Functions like the one for $a_t$ above, with some additions for dealing with constant and linear components, were used as the input to scipy.optimize.curve_fit() to find the final parameters, which consist of:

  • Eight frequency components, for each of the six parameters.
  • Eight sine and eight cosine amplitudes, for each of the six parameters, which we use to calculate the magnitudes and phases using the formulae above.
  • A constant offset for each of the six parameters.
  • A linear time-multiplier for the Longitude of Ascending Node, Argument of Periapsis, and Mean Anomaly parameters.

We have now optimised these six functions on 16384 days (or just over 44 years) of data from 2000 to 2044. Testing it on the next 16384 days of data from JPL Horizons showed that it fits reasonably well. From there, we can calculate with reasonable accuracy the orbital parameters of the Moon at any point in at least the next ~70 years, but likely much longer.
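For illustration, here’s roughly what the fit for a single orbital parameter looks like with scipy.optimize.curve_fit – a minimal sketch, assuming the samples have already been unwrapped and the eight initial frequencies read off the FFT:

import numpy as np
from scipy.optimize import curve_fit

def model(t, offset, gradient, *params):
    """Constant + linear component + 8 sine/cosine amplitude pairs."""
    result = offset + gradient * t
    for i in range(0, len(params), 3):
        f, s, c = params[i], params[i + 1], params[i + 2]
        result = result + s * np.sin(2 * np.pi * f * t) + c * np.cos(2 * np.pi * f * t)
    return result

def fit_parameter(t, values, initial_freqs):
    """t: day numbers; values: unwrapped samples of one orbital parameter."""
    p0 = [values.mean(), 0.0]
    for f in initial_freqs:
        p0 += [f, 1.0, 1.0]  # initial guess: frequency, sine amp, cosine amp
    popt, _ = curve_fit(model, t, values, p0=p0)
    return popt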

Keplerian Orbits

Code link: keplerian_orbit.cc

So, how do we use these six orbital elements to calculate the position of the Moon? This is heavily based on NASA’s documentation here about finding the approximate positions of planets, but it applies to any Keplerian orbit including the Moon’s (if you have accurate orbital elements).

First, we need to find the x and y positions in the Moon’s orbital plane, and for that we need to find the Eccentric Anomaly ($E$) from the Mean Anomaly ($M$) and the Eccentricity ($e$):

$$M = E\ -\ e \cdot sin(E)$$

It’s not possible to solve this analytically for $E$, but we can approximate it numerically using Newton’s method. We’re trying to find $M\ -\ E + e \cdot sin(E) = 0$, so let’s set:

$$F(E) = M\ -\ E + e \cdot sin(E)$$

$$E_0 = M\ -\ e \cdot sin(M)$$

(I suppose setting $E_0 = M$ might work too.) Then we can iterate this:

$$F'(E) = -1 + e \cdot cos(E)$$

$$E_{n+1} = E_n\ -\ F(E_n) / F'(E_n)$$

Repeating this until the delta ($F(E_n) / F'(E_n)$) is small enough gives us a good approximation of $E$. Then, we can use the eccentric anomaly $E$, the eccentricity $e$, and the semi-major axis $a$ to get $x$ and $y$, and our $z$ starts at the origin:

$$x = a (cos(E)\ -\ e)$$

$$y = a \sqrt{1\ -\ e^2} \cdot sin(E)$$

$$P_{initial} = (x, y, 0)$$

From there, we just need to rotate these coordinates into position. In my implementation, I used quaternions (extremely useful tutorial here) for this because they are a nicer way of representing arbitrary rotations and they’re easier to read than complex matrix multiplications. However, for simplicity below I’ve assumed an equivalent rotation matrix for each step:

$$P_{periapsis} = R_Z(argument\ of\ periapsis) \times P_{initial}$$

$$P_{inclined} = R_X(inclination) \times P_{periapsis}$$

$$P_{final} = R_Z(longitude\ of\ ascending\ node) \times P_{inclined}$$

This gives us $P_{final}$ as the cartesian position in our reference frame, in this case the ecliptic plane centred on the Earth.
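Putting this section together, here’s a hedged Python sketch of the whole calculation (the real implementation is C++ and uses quaternions; all angles here are in radians):

import numpy as np

def eccentric_anomaly(M, e, tolerance=1e-12):
    """Solves M = E - e*sin(E) for E using Newton's method."""
    E = M - e * np.sin(M)  # initial guess E_0
    while True:
        delta = (M - E + e * np.sin(E)) / (-1 + e * np.cos(E))
        E -= delta
        if abs(delta) < tolerance:
            return E

def rot_x(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_z(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def keplerian_to_cartesian(a, e, i, lon_asc_node, arg_periapsis, mean_anomaly):
    E = eccentric_anomaly(mean_anomaly, e)
    # Position in the orbital plane, before any rotations.
    p_initial = np.array([a * (np.cos(E) - e),
                          a * np.sqrt(1 - e * e) * np.sin(E),
                          0.0])
    # Rotate into position: argument of periapsis, then inclination, then node.
    return rot_z(lon_asc_node) @ rot_x(i) @ rot_z(arg_periapsis) @ p_initial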

Earth’s Rotation

Code links: earth_rotation.cc, and some of cartesian_location.cc

We can calculate the Moon’s position in ECI, but we need it in ECEF. The next step is to convert between them. There are several steps here:

  1. Convert from Ecliptic ECI to J2000-Equatorial ECI (based on the J2000 north pole and equator), by applying the axial tilt.
  2. Convert from J2000-Equatorial ECI to the ECI frame based on the current north pole and equator, which is known as CEP or “Celestial Ephemeris Pole”. This involves applying axial precession and nutation.
  3. Convert from this CEP frame to ECEF, by applying the rotation of the Earth.

The ESA provides a full description of the steps on their Navipedia site, but it provides a lot of technical detail without much context, so I’ll just summarise here.

Step 1 is very simple: since we defined our ecliptic and equatorial planes with the X axis pointing along the line where they intersect, we can just rotate the coordinates by 23.4° around the X axis.
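As a sketch (using the more precise J2000 obliquity of ~23.439° rather than the rounded 23.4°):

import numpy as np

OBLIQUITY = np.radians(23.439291)  # Earth's axial tilt at J2000

def ecliptic_to_equatorial(position):
    """Rotates an ecliptic-ECI vector about the shared X axis."""
    c, s = np.cos(OBLIQUITY), np.sin(OBLIQUITY)
    rotation = np.array([[1, 0, 0],
                         [0, c, -s],
                         [0, s, c]])
    return rotation @ np.asarray(position)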

Step 2 is more complicated. The north pole (and therefore the equator) actually move around over time, in cycles on various different time scales. The largest movement is from axial precession, which means that over 26,000 years the Earth’s axis makes one full rotation. This actually means that in 13,000 years, the northern hemisphere will have summer in January and winter in July. It also means that the vernal equinox line by which we define the ECI X axis moves over time, doing a full circuit in 26,000 years.

The other part of step 2, Nutation, is higher-frequency than precession, and it occurs because the forces that lead to precession are not constant, but vary over the course of the Earth’s orbit due to gravity from the Sun, Moon, and planets. Applying it correctly involves using some long lookup-tables.

Step 3 is mostly about the rotation of the Earth on its (now well-defined) axis. There is also a component called Polar Motion, which is a slight drifting of the axis with respect to the Earth’s crust, but it has a tiny magnitude and requires frequent updates to calculate in an up-to-date way, so I’m ignoring it here.

Sidereal time diagram. Xaonon, CC BY-SA 4.0, via Wikimedia Commons.

The Earth rotates on its axis once every 23 hours, 56 minutes, and ~4 seconds. The difference between this and the 24 hours you might expect is because we need to measure the sidereal rotation (compared to ECI) rather than the rotation compared to the direction of the Sun from Earth, which is always changing. The extra 3 minutes and ~56 seconds add up over the course of a year to one extra sidereal rotation.

The details of this rotation on the ESA website are quite confusing. They define $\theta_G$ as the Greenwich Mean Sidereal Time (GMST), in terms of a fixed rotation over time compared to a reference $\theta_{G_0}$. The changing component is based on the current time in UT1 (which is very close to UTC), but they didn’t provide a reference epoch, so I initially assumed it was J2000, but in fact the reference is defined at midnight at the start of the current day. (In retrospect, they also say that $\theta_{G_0}$ is the GMST at $0^h$ UT1, but it wasn’t clear to me where they were counting hours from – what finally made me notice the error was that both $\theta_{G_0}$ and $\theta_G$ account for the sidereal rotation, and they’re added together in the end).

From $\theta_G$ (GMST), we need to correct for nutation again to get $\Theta_G$ (GAST, or Greenwich Apparent Sidereal Time), and then rotate the Earth by that angle. All of the rotation matrices on the ESA website are inverted (they rotate in the wrong direction), and all of the angles passed in to them are negated to compensate. Converting this all to use standardised quaternions for rotation made things much easier to understand.
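For reference, GMST can also be computed directly from the UT1 Julian date in a single expression; this sketch uses the standard formulation from Meeus’s Astronomical Algorithms rather than the ESA’s two-part one:

import numpy as np

def gmst_degrees(julian_date_ut1):
    """Greenwich Mean Sidereal Time, in degrees, for a UT1 Julian date."""
    d = julian_date_ut1 - 2451545.0  # days since J2000
    t = d / 36525.0                  # Julian centuries since J2000
    theta = (280.46061837
             + 360.98564736629 * d   # ~360.9856 deg/day: the sidereal rate
             + 0.000387933 * t * t
             - t * t * t / 38710000.0)
    return theta % 360.0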

Finding our own location

Code link: location.cc

To be able to point at some ECEF coordinates, we first need to know where we are. GPS coordinates are polar coordinates, which give a latitude (the angle away from the equator), a longitude (the angle away from the prime meridian), and an elevation (the height above the Earth’s reference ellipsoid).

The ellipsoid can be defined in terms of a polar radius (from the north or south pole to the centre) and an equatorial radius (from the equator to the centre). Obviously the Earth’s real surface varies much more than a perfectly smooth ellipsoid, and even sea level doesn’t quite match it. GPS receivers usually give you both an elevation above mean sea level and an elevation above the reference ellipsoid.

To begin with, it’s worth noting that because we’re not standing on a sphere, a line pointing along the normal to the surface (one of the definitions of “straight down”) doesn’t necessarily point to the centre of the Earth. It’s close, and it does intersect the line going from the north pole to the south pole, but unless we’re actually standing on the equator or at one of the poles it won’t pass through the exact centre.

The first number we need to calculate is the distance along the normal from our location to the line between the north and south poles. If we call the radius at the equator $a$ and the radius at the poles $b$, this can be calculated as:

$$N(lat) = \frac{a^2}{\sqrt{(a \cdot cos(lat))^2 + (b \cdot sin(lat))^2}}$$

From there, it’s relatively easy to find the $x$ and $y$ coordinates in ECEF:

$$x = (N(lat) + elevation) \cdot cos(lat) \cdot cos(long)$$

$$y = (N(lat) + elevation) \cdot cos(lat) \cdot sin(long)$$

Finding $z$ is slightly more complicated, but doesn’t depend on the longitude:

$$z = \left(\frac{b^2}{a^2} \cdot N(lat) + elevation\right) \cdot sin(lat)$$
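Putting the three formulae together with the WGS84 ellipsoid radii gives a short conversion function (a sketch; the real conversion lives in location.cc):

import numpy as np

WGS84_A = 6378137.0       # equatorial radius, metres
WGS84_B = 6356752.314245  # polar radius, metres

def gps_to_ecef(lat_deg, long_deg, elevation_m):
    """Converts latitude/longitude/ellipsoidal elevation to ECEF metres."""
    lat = np.radians(lat_deg)
    lon = np.radians(long_deg)
    n = WGS84_A ** 2 / np.sqrt((WGS84_A * np.cos(lat)) ** 2
                               + (WGS84_B * np.sin(lat)) ** 2)
    x = (n + elevation_m) * np.cos(lat) * np.cos(lon)
    y = (n + elevation_m) * np.cos(lat) * np.sin(lon)
    z = ((WGS84_B ** 2 / WGS84_A ** 2) * n + elevation_m) * np.sin(lat)
    return np.array([x, y, z])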

Pointing

Code link: cartesian_location.cc (specifically, see directionTowards())

We have the coordinates of ourselves and the Moon, now we just need to point from one of them to the other. We can already calculate a vector, but it’s not very useful without some reference to compare it to. We need to know where that vector is pointing in a local coordinate system.

The two angles we need to point at something are often called the azimuth and the altitude. Note that the altitude here is an angle, and not the height above the ellipsoid that I’ve been calling elevation – it seems like in astronomy the term “altitude” is overloaded, and these two terms are often used interchangeably, so I’ve arbitrarily picked “elevation” for height and “altitude” for this angle.

If you’re standing on the surface of the Earth and pointing, your finger’s azimuth is the angle between north and the direction you are pointing (0° = north, 90° = east, etc.). Your finger’s altitude is the angle upwards from the horizon to it. Using these two angles, we can point towards any object in 3D space.

We need a reference direction: the normal to the surface of the Earth ellipsoid at our current location. This is fairly easy to define based on our latitude and longitude:

$$\vec{up} = \begin{bmatrix} x \cr y \cr z \end{bmatrix} = \begin{bmatrix} cos(lat) \cdot cos(long) \cr cos(lat) \cdot sin(long) \cr sin(lat) \end{bmatrix}$$

Getting the altitude from here is simple, given a unit-length direction to point $\vec{D}$:

$$altitude = 90\ -\ arccos(\vec{up} \cdot \vec{D})$$

Finding the azimuth is trickier – the easiest way I’ve found is to first rotate both $\vec{D}$ and $\vec{up}$ to the point where the prime meridian crosses the equator:

$$toPrimeMeridian = atan2(\vec{up}.y, \vec{up}.x)$$

The result from atan2() is the anticlockwise angle from the prime meridian, when viewed from the north pole. Our rotations are clockwise around the axis we choose, and we’re looking from the north pole to the south pole, so to remove this rotation we need to rotate clockwise around the negative Z axis:

$$\vec{up}_{prime\ meridian} = R_{-Z}(toPrimeMeridian) \times \vec{up}$$

Next, let’s put this at the equator. This time we’re looking along the positive Y axis:

$$toEquator = atan2(\vec{up}_{prime\ meridian}.z, \vec{up}_{prime\ meridian}.x)$$

$$\vec{up}_{0,0} = R_Y(toEquator) \times \vec{up}_{prime\ meridian}$$

But we aren’t really interested in $\vec{up}_{0,0}$, we already know which way it’s pointing. What we need are those same rotations applied to $\vec{D}$:

$$\vec{D}_{0,0} = R_Y(toEquator) \times R_{-Z}(toPrimeMeridian) \times \vec{D}$$

And finally we can find our azimuth (converting from anticlockwise from east, to clockwise from north):

$$azimuth = atan2(1, 0)\ -\ atan2(\vec{D}_{0,0}.z, \vec{D}_{0,0}.y)$$
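Here’s an equivalent hedged sketch in Python – instead of the two explicit rotations, it uses local east/north vectors at the observer, which yields the same two angles:

import numpy as np

def azimuth_altitude(observer_ecef, target_ecef, lat_deg, long_deg):
    """Azimuth/altitude (degrees) of target as seen from observer (ECEF metres)."""
    lat, lon = np.radians(lat_deg), np.radians(long_deg)
    up = np.array([np.cos(lat) * np.cos(lon),
                   np.cos(lat) * np.sin(lon),
                   np.sin(lat)])
    east = np.array([-np.sin(lon), np.cos(lon), 0.0])
    north = np.cross(up, east)  # completes the local frame

    direction = np.asarray(target_ecef) - np.asarray(observer_ecef)
    direction = direction / np.linalg.norm(direction)

    altitude = 90.0 - np.degrees(np.arccos(np.clip(np.dot(up, direction), -1, 1)))
    azimuth = np.degrees(np.arctan2(np.dot(east, direction),
                                    np.dot(north, direction))) % 360.0
    return azimuth, altitude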

Using the azimuth and altitude together, we can figure out which direction to point in, and all we need as references are an up/down vector, and the direction of north. Putting all of this together, we can point at the Moon!

See https://github.com/abryant/CosmicSignpost for the code. There may be more blog posts coming about the motor control code, electronics, and hardware.

Thanks go to my partner Dalia, who pointed me towards scipy and specifically the unwrap function, and helped with analysing the FFTs of the Moon’s orbital elements.

Thanks also go to my Dad, who showed me how to use sine and cosine amplitudes as a more optimisation-friendly formula for the orbital elements than magnitude and phase.

8x8x8 LED Cube

A few months ago, I posted about my LED Cube, which was a 4x4x4 grid of 64 LEDs. Since then, I’ve built a version that’s 8 times as large and twice as dense in each dimension, an 8x8x8 cube:

It has 512 LEDs, four separate signal cables coming from a single microcontroller, and 1824 solder joints. The spacing between LEDs is 2cm (compared to the 4x4x4’s 4cm), meaning that the whole thing is around 16cm to a side. From when I got all of the LEDs to when I finished it, it took around 3 months to make; however, the actual time spent working on it was probably in the range of 100 to 200 hours.

It runs on the same server software stack as the 4x4x4, but with a different microcontroller to download the cube states and display them (which requires some different timing logic, and different signal sending code to write to all four signal lines at once). My git repository is here: https://github.com/abryant/LED-Cube.

Most of this post is some fairly detailed instructions about how to build it / how I built it, with some extra information about how it’s controlled at the end.

Building the Cube

The cube is built of 8 layers, each of which is 8 strips, each of which contains 8 LEDs.

A strip

Each LED has four wires: Data out, GND, +5V, Data in, in that order. The first thing to do is connect all of the LEDs to a power supply, which also serves as the structure holding them up. I used a block of polystyrene (which came with the soldering iron) to hold the LEDs in place, with one of the pins of each LED sticking straight down into it:

Here, the longer ground wires are along the bottom, the shorter +5V wires are along the top, and the data pins go up/down in alternating directions (more on that later). The marks are where the last two LEDs needed to go. Next, we need to solder the rails on the +5V and GND lines, which involves straightening the wire. For this, I am using 22 gauge galvanized steel wire:

  1. Cut the wire to length from the reel. This requires two 16cm pieces of wire for a single strip, or 16 pieces for all of the strips in a layer.
  2. Straighten the wire by putting it between two pieces of card (I used a greetings card I had lying around), and pressing down on the card while pulling the wire through with some pliers.
  3. Finish straightening the whole set. The 90 degree bend at the end of the wire is partly for pulling it through the card with the pliers, and partly to secure it to the polystyrene while soldering.

Next, the wire can be placed on top of the LEDs pins for +5V and GND. They need to be very close to the LEDs so that the layers can be 2cm apart later on:

Note that the data pins go in alternating directions: the first LED has its Data In pin going down into the polystyrene, and the second has its Data Out pin going down. You can see this in the following photos. Note that the side of the LED with the flat edge is the side with the Data Out pin, so the order is: “Flat edge, Data Out, GND, +5V, Data In, (Round edge)”


The reason that the LEDs alternate in direction is so that the data pins can all tie together into one continuous line going in one direction. This will become clearer when we solder 8 strips into a layer.

At this point, the strip is finished, and it needs testing because occasionally one of the colours on an LED won’t work. I did this by connecting the strip to a breadboard with a 5V power supply. Each LED draws around 60 milliamps at full brightness, meaning that a strip can draw 480mA, so a good USB power supply should be enough. To test the LEDs, I moved the data cable between the LEDs’ Data In pins while running a test program. The test program is a very simple one that cycles between red, green, and blue every half-second (it ran on a Raspberry Pi, with a USB cable connected to an Arduino Nano that sent the data signal).

Here’s a photo of an old version being tested (this was when I was originally using 4cm gaps between LEDs):

Note that the LEDs hold their colour when the data cable is not connected, so I’ve set them to an RGB pattern by disconnecting the data cable from each at the right time.

A layer

Once we have 8 strips, we can solder them together to make a layer. Here’s a diagram of how they fit together:

This requires careful alignment to make sure each strip is exactly 2cm from its neighbours all along its length.

  1. Use the uncut ends of the GND or +5V pins to stick each strip into the polystyrene. They should all be the same way around, so let’s assume all of the GND pins are facing down. It may take a lot of adjustments and careful measurements to ensure that the 2cm gap is maintained between all of the strips.
  2. Take four more straightened 16cm pieces of wire (with 90 degree kinks at the ends), and lay two of them across the top of the strips, connecting all of the +5V lines. There should be one half-way between the 1st and 2nd LEDs, and another half-way between the 7th and 8th LEDs. When positioning these pieces of wire, make sure that the end with the 90 degree kink is always on the right hand side, and as far away from the LEDs as possible while allowing all 8 joints to be soldered. This will give us some extra wire to work with when we are soldering the layers together later on.
  3. Solder the two pieces of wire down to the strips, making sure to measure the 2cm intervals after every joint. Once this is done, all 8 strips should be attached on the +5V side.
  4. Pull the layer out of the polystyrene and flip it over to do the other two pieces of wire, which will connect the GND lines.
    At this point, the layer will have a common +5V plane and a common GND plane, but the data line will not be connected.
  5. To connect the data line, bend the pins into the right positions as shown in the diagram above, and then use some wire cutters to trim each pair (that need to be connected) to length. It’s easiest to trim both of the pair at the same time. After all of them have been trimmed, they can all be soldered together, which completes the first layer.

On alternating layers, the strips need to be made with the data pins going in the opposite direction. This allows the data line to be reversed for the 2nd, 4th, 6th, and 8th layers, which ensures that the signal can go through them after it gets to the end of the 1st, 3rd, 5th, and 7th layers.

Once a layer is finished, it can be tested. Each layer contains 64 LEDs and each LED draws a maximum of 60mA, so the power supply for the test needs to supply 3.84A at 5V. One good test program for a layer is to draw a moving line of colour through the LEDs.

A cube

Once you have 8 layers, they can be connected together into a cube.

But first, they need trimming so that the +5V and GND pins do not interfere with the other layers:

To put the layers together:

  1. Decide which direction you want the layers to connect together in. (I forgot to do this, accidentally connected them from right to left, and had to flip the direction of the layers in software afterwards, because it was the opposite of my 4x4x4 cube.)
  2. Line up the layers in some polystyrene in approximate positions (they don’t need to be accurate yet). The layers should alternate in data signal direction, so that odd layers have their signal going upwards and even ones go downwards.
  3. Put four new pieces of straightened 16cm wire along the GND and +5V lines that are sticking out of the sides. These four wires should be as close to the LEDs as possible without touching any other wires (make sure they’re not too close to the data lines).
  4. The four lines can then be soldered to the first layer. It may be easier to remove the layers from the polystyrene before soldering the second layer, because the alignment is very difficult to get right at this stage, and it can be easier to solder each joint separately.
  5. Solder the rest of the layers onto these four lines, being very careful that they are connected at 2cm intervals, and that the layers are aligned horizontally and vertically.
  6. Solder the Data Out pin at the top of every odd layer to the Data In pin at the top of the next even layer. This will produce four sets of two layers that can be controlled independently. The inputs to each of these four pairs of layers will come from four pins on the microcontroller.

Here’s a cube with only three layers soldered on:


It’s important that two adjacent layers don’t touch, because one layer’s +5V plane is very close to the next layer’s GND plane.

Here’s the full cube after soldering:

Powering the Cube

At this stage, you should have a single cube of 512 RGB LEDs with a common ground and +5V line. However, this cube can draw up to (512 x 60mA =) 30.72 Amps! It will not turn on with a power supply that can’t supply a large current. Additionally, the wire connecting the LEDs together is fairly thin and probably won’t support a full 30.72A of current without heating up a lot. So to turn this on, we need some way to distribute a lot of power around the cube.

The way I decided to do this was to buy two thick pieces of copper wire (one with red insulation, one with black) with enough conductance to support the whole cube. I then stripped the insulation from the middle(-ish) of each piece of wire, and soldered it along the bottom +5V and GND lines connecting the layers together.

I then collected the two ends of each wire and secured them into the terminals on the power supply (I measured my wires so they each had the right amount of slack to reach it). My power supply is a brick that supplies 40A at 5V and has three terminals for +5V and three for GND, so it can cope with the power and logistical requirements.

Controlling the Cube

The cube isn’t very interesting if all it does is show the blue colour that the LEDs default to. The microcontroller attached to it is what allows it to display more interesting patterns, however the microcontroller I use is also fairly simple: it just connects to my server using a wireless internet connection and displays whatever the server tells it to. This allowed me to make a web interface and control the cube from my phone, but it does require a constant internet connection and can eat up some data (at 20 frames per second, it uses around 30 KB/s).

Client hardware

The client I use for the cube is an ESP-12E module, which has an embedded wireless chip and a 160MHz processor.

As I mentioned in my previous post, the timing for the APA106 LEDs is fairly precise, and needs a signal pulse every 1710 nanoseconds (of either 350ns for a zero bit, or 1360ns for a one bit). This requires some circuitry that can cope with fast changes without suffering from interference.

Annoyingly, the ESP-12E works at 3.3V rather than 5V, so I had to use a voltage regulator to convert the cube’s 5V power supply down to 3.3V first, and then use some transistors to pull the signal wires it outputs back up to a 5V signal (which actually inverts the signal at the same time, so the code has to send 0 when it means 1 and vice versa).

To make sure the power supply is smooth and doesn’t suffer from any interference, I have both a ceramic capacitor and an electrolytic capacitor in parallel across it.

To convert the signal from the four data-out pins from 3.3V to 5V, I’m using four n-channel MOSFETs and four 470Ω pull-up resistors. Here’s the circuit diagram (click to zoom):

Client software

The ESP-12E (which uses the ESP8266 chipset) can be programmed as an Arduino device. While it doesn’t support the instruction timing that I used in the Arduino Nano program, it is fast enough that it can count the clock cycles itself. It also has enough spare cycles to calculate four different output signals at once, which decreases the time required to update the cube by a factor of 4, from ~21ms to ~5.25ms. It takes this long because there are 512 LEDs to update, each of which needs 24 bits of colour data, and each bit takes 1710 nanoseconds to send: 512 x 24 x 1710ns = 21,012,480ns ~= 21ms. Since interrupts must be disabled while sending the signal to the cube, pausing for 5.25ms is much more feasible, especially when we want to update at 20-30 frames per second and we’re waiting for data from the wireless network at the same time.

There are some really useful Arduino libraries for WiFi management. The one I’m using will auto-connect to the last network it connected to, but if it fails for some reason or can’t connect to the server it will start up its own network with a captive portal that lets you scan for networks and enter the password to one, after which it will connect to that network and continue from there.

The full client code is here:
https://github.com/abryant/LED-Cube/tree/master/arduino

Server software

The server is a multithreaded Python HTTP server. When a request comes in from the cube, it registers that the cube exists and starts a controller that keeps the connection open indefinitely to send frames to. The web interface sends commands via POST requests, and when one comes in the controller responds by changing what the cube is doing, for example by switching the visual currently displayed on the cube. When a non-interactive visual is running, the controller just gets the next frame 20 times per second (or at whatever speed the controller has been set to). Interactive visuals are handled differently, and can respond to other POST requests from the web interface.

The visuals themselves are Python generator functions (similar to an infinite iterator, but much easier to write) that yield Cube objects or lists of colours. A Cube object is a 3D array of colours with some utility functions that let the visuals do interesting things. Individual visuals can be quite short pieces of code: for example “Layers” is less than 30 lines, and more complex ones like “Faces” and “Snakes” are less than 100 lines of code. “Starfield” has about 5 lines of actual non-boilerplate code.
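As an illustration of the generator style, here’s a toy visual – the frame format and everything else here is a hypothetical stand-in, not the real server API:

import colorsys

def rainbow_pulse(size=8, frames_per_cycle=100):
    """A toy visual: the whole cube pulses through the colour wheel.

    Yields one frame at a time as a flat list of (r, g, b) tuples, one per
    LED; the real visuals yield Cube objects with richer helpers.
    """
    frame = 0
    while True:  # visuals are infinite generators; the controller paces them
        hue = (frame % frames_per_cycle) / frames_per_cycle
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        colour = (int(r * 255), int(g * 255), int(b * 255))
        yield [colour] * (size ** 3)
        frame += 1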

The visuals can also be chained together in sequences, using some generator utilities (which is what’s happening in the video above). Whenever a generator yields a boolean value, the generator utilities treat it as a break point and have the option of moving on to another visual, or doing something else.

The server also sends the current state of the cube to any listening instances of the web interface, which then draws it on screen:

This cube was a lot of fun to make, and I’m still experimenting and improving it with software (e.g. new visuals) and hardware (e.g. ways to cool the LEDs when they’re at full brightness). It taught me about soldering (I hadn’t soldered for about a decade), building circuits, Arduino programming, Python web servers and generators, WebGL, and probably lots of other things.

Thanks to everyone who helped me, and to those who came up with ideas for visuals to display on it. Special thanks to my grandfather James M. Bryant without whose help I wouldn’t have known about APA106 LEDs, the ESP-12E, or the various bits of electronics I needed to put the whole thing together.

I have a few ideas for new visuals, but I’m always interested in more – feel free to send me ideas, or even better, write code for them and send it to me!

LED Cube

I’ve been building a cube recently:

It’s built from 64 programmable LEDs (in a 4x4x4 lattice), that I can control with a serial signal from a 16MHz Arduino. More specifically, these are APA106 LEDs, which have a chip embedded inside them that decodes the signal and outputs the programmed colour levels to the LED. They take a 3-byte RGB signal, which in theory means 16,777,216 possible colours (although your eyes can only see around 10 million).

Each LED has four wires: Data out, GND, +5V, Data in (left to right in the picture above). The 5V supply and ground pins are the main power source, and the 64 LEDs in the cube are connected to the supply in parallel. The Data in and Data out pins, however, run in series through the whole string of LEDs – the first LED’s Data out pin connects to the second LED’s Data in, and so on. Only the first LED in the sequence gets a signal from the Arduino, and once it has received its colour it propagates the rest of the serial signal on to the LED after it. Because all of the LEDs work this way, you can just send a sequence like R1,G1,B1,R2,G2,B2,R3,G3,B3,… to update all of the LEDs at once.

In the video above, all 64 LEDs in the cube are being updated 20 times per second. For parts of it, you can see a rainbow pattern weaving its way through each of the LEDs in turn. The order in which the LEDs turn on/off there is the order in which the data pins of the LEDs are connected to each other.

At maximum brightness, each LED draws about 60mA at 5V, meaning that the whole cube draws 3.84A. Most USB ports and phone chargers only supply one or two amps, so here I’m using a separate 5V 10A power supply. The Arduino is still powered over USB, so to connect the data wire safely I had to also connect the two ground wires together, but not the 5V.

The timing on the signal needs to be fairly precise, according to the APA106 datasheet the smallest time between signal changes is 350ns with an error of up to ±150ns. Initially, I tried to do this from a Raspberry Pi’s GPIO pins, but my program would always get interrupted in the middle of sending the signal, so I had to use an Arduino instead. I adapted some signal sending code for a similar type of LED I found online, which disables interrupts and uses assembly code for instruction level timing based on the fact that the processor runs at 16MHz.

In order to write visualisations in a higher level language than Arduino C, I wrote a very simple Arduino program that listens for data on USB and forwards it to the LEDs with the correct timing. The USB signal has a maximum baud rate of 115200, meaning that I can send a maximum of 115200/(64x3x8)=75 updates per second to the whole set of 64 LEDs. In practice, I’m only sending 20 per second.

To send those signals and generate interesting patterns, I’ve written some Python programs that run on a Raspberry Pi, which the Arduino is plugged into. These programs generate one frame after another and send them to the Arduino at 0.05 second intervals.
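A minimal sketch of that sending loop using the pyserial package – the device path and the exact payload framing are assumptions here, so check the repository for the real protocol:

import time
import serial  # pip install pyserial

ser = serial.Serial("/dev/ttyUSB0", baudrate=115200)  # assumed device path

def send_frame(colours):
    """Sends 64 (r, g, b) tuples as a 192-byte serial payload."""
    payload = bytes(channel for colour in colours for channel in colour)
    ser.write(payload)

while True:
    frame = [(255, 0, 0)] * 64  # placeholder frame: all LEDs red
    send_frame(frame)
    time.sleep(0.05)            # 20 frames per second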

You can find the code, including Python visualisations and the Arduino LED controller, on my GitHub here: https://github.com/abryant/LED-Cube

Some of the visualisations I’ve written so far are:

  • A rainbow filling up the cube in LED order.
  • Blocks of colour scrolling past in different directions.
  • One face of the cube rotating to become another face and changing colour.
  • A 3D game of snake.
  • Tetris blocks falling in random places until they reach the top.
  • Pulsing the whole cube in different colours.
  • Falling green lights (matrix-style).
  • A rotating 3D colour wheel.
  • 3 different coloured snakes all avoiding each other.

You can see some more photos and videos here:
https://photos.app.goo.gl/b63fNjHddFoOuM0C2

Unfortunately, some of these don’t work very well because the cube is so small. 4x4x4 isn’t enough to display much detail, for example you can’t display text on it. To remedy that, I’ve started building an 8x8x8 cube with 512 LEDs. Hopefully I’ll put up some more detailed blog posts about the construction and soldering as I go.

Plinth Isn’t Dead!

Update 2018-02-26: This was written several years ago, and Plinth development is essentially dead now. (In retrospect, although it was true at the time, this was a bad title and it’s been annoying me ever since.)


I thought I’d write a status update, since it’s been almost four months since I last posted anything here.

Firstly, Plinth isn’t dead! I have been working on it, albeit slightly more slowly than I’d have liked, for the last four months. At first, that was because Generics took me a while to get right, as it required major structural changes throughout the compiler. More recently, I’ve just been away on a couple of holidays. For the last few days, though, development has sped up significantly.

Now, since this is a status update, here’s a list of all of the major things I’ve added since my last post:

Generics [commit]

This took much longer than it should have – partly because it changed so much of the compiler’s data structure, but also because code generation for it was much more difficult than I expected (see last post).

A Test Framework [commits: 1 2 3 4]

There is now a suite of tests that covers everything up to name resolution. I still need to write tests for: cycle checking, type checking, control flow checking, and exception checking.

If you want to run all of the tests, just run `ant run-tests` from the project root directory.

A new array-initialisation syntax [commit]

Before this change, it was impossible to create an array of anything that didn’t have a default value, unless you knew its size in advance. Here’s an example:

[]string foo = new [n]string;

^ This is impossible – you can’t create an array of not-null strings unless you know all of their values beforehand. Here’s what you could do before:

[]string foo = new []string {"foo", "bar"}; // only has two elements, not n
[]?string bar = new [n]?string;             // all elements are nullable

But now, with this new syntax, you can create them much more easily:

[]string foo = new [n]string("hello"); // initialises all of the strings to "hello"

Or, if you define a function somewhere first:

string makeString(uint index)
{
  return "string number " + index;
}
[]string foo = new [n]string(makeString);

This will become much easier to use once closures are added, which will hopefully be soon.

Wildcard types [commits: 1 2]

Wildcards are useful if you want to store an object but don’t care about its generic type arguments. For example, a List<?> can only store one type inside it, but we don’t care what that type is. We can’t store new elements in it, but we can retrieve them as objects.
Adding wildcards required a lot of changes to the way run-time type information works, mainly because of cases like this one:

ArrayList<?> list = new ArrayList<Bar>();
List<? extends Foo> list2 = cast<List<? extends Foo>> list;

To do this sort of check, we need to go through all of the super-types of the value being tested (list) for the target type (List), and then make sure the constraints on the wildcard type argument encompass the actual type argument we get from the object’s run-time type information. This gets especially complicated when you cast to a type like List<? extends Foo super Bar>, or when you have multiple type arguments, so the runtime’s code for checking this is a bit long (although that’s partly because it’s written in LLVM bitcode).

A copyright notice [commit]

This should have been added a while ago, but the code is now officially under the three-clause BSD license, so you can use it for almost anything you want.

Array bounds checking [commit]

Now, if you index past the end of an array, you won’t get a segmentation fault any more – an IndexError will be thrown instead. Also, IndexError has been added to the standard library.

Various standard-library improvements [commits: 1 2 3 4 5 6]

Some new methods have been added to string, such as trim() which trims all unicode whitespace characters.
string now has a read-only property called “length” instead of a method, which will be the standard way of exposing the length of something in the standard library (no size() methods on Lists!).
Also, the stdin::readLine() method can now throw EOFException.

ArrayList<T> and LinkedList<T> implementations [commits: 1 2 3]

These both implement a new List<T> interface, which extends Iterable<T>.
ArrayList<T> is actually implemented as a ring buffer in an array. I’m not sure why this isn’t more standard in ArrayList implementations, but it makes insertions to and deletions from the start of the list O(1) with hardly any overhead.

For-each loops [commit]

These loops are extremely flexible: they don’t just iterate through arrays and Iterables, they also work on integers (signed and unsigned) and Iterators. This means you can do things like this:

for uint x in 5
{
  stdout::print(" " + x); // prints: 0 1 2 3 4
}
for short y in value
{
  stdout::print(" " + y);
  // if value == -5, prints: 0 -1 -2 -3 -4
  // if value == 0, prints nothing
  // if value == 3, prints: 0 1 2
}

And this:

Iterator<Foo> iter = list.iterator();
if iter.hasNext()
{
  iter.next(); // skip the first element
}
for Foo foo in iter
{
  stdout::println(foo); // starts at the second element of the list
}

Auto-assign parameters [commit]

These are something I’ve wanted Java to have for a while, but they’re purely syntactic sugar. Basically, they let you define a constructor/method(/property setter) parameter which, instead of creating a local variable, just assigns to a field. It’s easier to show it than to describe it. Instead of writing this:

class Item
{
  string name;
  string description;
  uint price;
  create(string name, string description, uint price)
  {
    this.name = name;
    this.description = description;
    this.price = price;
  }
  // ... getters and setters ...
}

You can write this (also, you can use properties instead of writing getters and setters!):

class Item
{
  property string name;
  property string description;
  property uint price;
  create(@name, @description, @price);
}

The auto-assign parameters get assigned just before the body of whatever you’re calling.
With this change, when you’re using an auto-assign parameter a body is added automatically, so you can put a semicolon instead of an empty body.
You can also do some cool things like:

void setHalfPrice(@price)
{
  price /= 2;
}

This works because the parameter doesn’t create a local variable, so when you say “price” you’re actually referencing the property.


That’s as far as I’ve got right now. Hopefully I’ll add default arguments and varargs soon. However, I do need to do some preparation for job interviews, so I’m not sure when my next post will be.

Native Types and Generics

First, a status update: I have finished most of the runtime code required for generics, which involves several changes to how virtual function tables (VFTs) work, and some changes to run-time type information (RTTI) which will allow you to do instanceof checks and casts. What I have not yet done is to decide on a way of representing generic types in native code; there are several obstacles to overcome when devising this native representation, which I will try to describe in this post.

Simple Native Types

Before we start on generics, here is a brief overview of how native types work.

Each type is given a standard representation, which is always the same. For example, uint becomes a 32 bit integer, or an i32 in LLVM bitcode. Knowing the compile-time types allows us to know exactly which run-time type a value will have. Here are some examples of various native types:

(Format: Plinth type(s) → LLVM type: description.)

  • uint, int → i32: A 32 bit integer (signed/unsigned doesn’t make a difference).
  • boolean → i1: A 1-bit integer.
  • ?uint → {i1, i32}: A tuple of a 1-bit and a 32-bit integer. If the i1 is false, the value is null.
  • object, ?object → {%RTTI*, %VFT*}*: An object is just a pointer to a combination of run-time type information and a virtual function table.
  • []ubyte, ?[]ubyte → {%RTTI*, %VFT*, i32, [0 x i8]}*: Arrays are pointers to memory where the elements are stored sequentially. They store RTTI and a VFT for easy conversions to/from object, along with the length of the array (i32) and the elements themselves ([0 x i8]).
  • (short, ulong) → {i16, i64}: Tuples are structures which store each of their element values in turn.
  • string → {%AB*, %AI*}: string is a compound type which stores two values: an []ubyte (%AB*) and a ?[]uint (%AI*). Compound types are stored in the same way as tuples: using pass-by-value structures.
  • class Foo { uint x; } → {%RTTI*, %VFT*, %VFT*, i32}*: Classes are represented as pointers, the same way as objects. They have the standard RTTI and object VFT, plus their own VFT for every class and interface in their inheritance tree. They also store each of their fields.
  • {long -> boolean} → {%RTTI*, %opaque*, i1 (%opaque*, i64)*}: Functions are actually tuples of three values: an RTTI pointer, for easy conversion to/from object; an opaque pointer, which is the callee for the function (we don’t care what this points to, just that the function expects it); and a function pointer, which takes the opaque pointer as the first argument, followed by the rest of the arguments, and returns the return type.

This is the pre-generics design for native types. It supports all of the types which can be expressed in the language (apart from void, which just maps to the LLVM void type for function return types).

Simple Type Conversions

One interesting thing about Plinth’s type system is that it is unified, so we can cast any value to an object.

Obviously, some of these types are pointers that start with %RTTI*, %VFT* just like object; these types don’t need to be converted, we can just take the same pointer and treat it as an object value.

Other types, such as uint, do need to be converted. Unfortunately, this involves a heap allocation: we create an object with an extra field on the end (in this case i32), store the value inside that object, and cast it to a normal object. The same thing applies to tuples, compound types, and functions.
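As a rough source-level sketch of what this costs (the boxed layout follows the table above):

uint x = 42;
object o = x;          // heap-allocates a box ({%RTTI*, %VFT*, i32}*) and stores 42 in the extra i32
uint y = cast<uint> o; // reads the i32 back out of the box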

Let me just note that dynamic languages (e.g. Python) have this much easier – they can just say that all types look like objects, and not worry so much about converting between them. The tradeoff is that allocating everything on the heap like that is much less efficient.

Generic Native Types

In Plinth, a type parameter can stand for any other type, including tuples, functions, arrays, etc. Because of this, we need its native representation to be something which all other types can be converted to: object.

Now, because all type parameters are represented as objects, we already know how to convert to and from them: just do the normal conversion to/from object, and we’re done! Or are we…?
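For instance, the straightforward story looks like this (a minimal sketch with a hypothetical generic class):

class Wrapper<T>
{
  T value; // stored as an object pointer at run-time
}
// ... somewhere else ...
Wrapper<uint> w = ...;
w.value = 5;      // 5 is boxed into an object on the way in
uint u = w.value; // and unboxed again on the way out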

Generic Functions

Generic functions are functions which take generic arguments, or return generic results. Here’s an example of a class that uses them:

class Comparator<T>
{
  {T, T -> int} f;
}
// ... somewhere else ...
Comparator<string> comp = ...;
{string, string -> int} cf = comp.f; // A
comp.f = cf;                         // B

This Comparator class stores a generic function in a field. The function takes two values of some arbitrary type T, and returns an int. Its native representation therefore takes two objects and returns an i32. So, at line A, how do we convert it into a function that takes two strings and returns an i32? Obviously some conversion needs to be done, but we can’t do it until the function is called.

The only solution here is to create a proxy function that takes two strings, converts them to objects, and calls the original function. (In native code we do this by packing the original function into an object, and making that object the opaque pointer that gets stored alongside the new function pointer.) In effect, we now have something like this:

proxy function (string)
       |
       V
original function (object)

But now we get to line B, and this is where things get interesting. Here we have to do the opposite conversion, and since we can’t tell that the function we’re converting is already a proxy, we wrap it again, as follows:

proxy function (object)
       |
       V
proxy function (string)
       |
       V
original function (object)

Because of these two opposing conversions, we now have a chain of functions. If we did these conversions in a loop, we would accidentally create lots of entries in this chain, and then wonder why function calls are so slow. The only way around this is to detect the chain of proxy functions when we do a conversion: if the function we are about to wrap already appears lower down in the chain, we remove the proxies above it instead. Quite complicated for an implicit conversion!
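To see how easily this could bite, here’s an innocuous-looking loop (a sketch reusing the Comparator example from above) that would build a deep chain if the conversion didn’t do this check:

Comparator<string> comp = ...;
for uint i in 1000
{
  {string, string -> int} cf = comp.f; // wraps the stored function in a string proxy
  comp.f = cf;                         // wraps that proxy in an object proxy
}
// without chain detection, comp.f would end up ~2000 proxies deep,
// and every call through it would pay for every layer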

But even that isn’t the worst thing we have to consider.

Generic Arrays

Generic arrays are arrays of type parameters (or arrays of tuples containing type parameters). Let’s consider the following class:

class Foo<T>
{
  []T array;
  void set([]T array)
  {
    this.array = array;
  }
}
// ... somewhere else ...
Foo<uint> foo = ...;
foo.set(new []uint {2, 3, 4, 5});

What happens when we want to convert from []uint to []T?

The first thing that springs to mind is that this is an O(n) implicit conversion. This MUST be avoided. Even if the performance penalty weren’t an issue (which it definitely is), performing this conversion would create a second array, which would completely change the semantics of the function call (from pass-by-pointer to pass-by-value).

So the main constraint is that there must be exactly one array of values. What if the array itself was an []uint? Could we still use it as an []T? Actually, we can – we just need to know how to convert between the real type inside the array and T. If we know (a) whether the array elements are pointers, and (b) the size of each array element, then we can treat the array as a byte array and do some index calculations to get to a specific element, and if necessary box that element up in an object.

This system works for the simple cases like []T, but starts to break down when you consider things like arrays of tuples or functions: converting an []{T, T -> int} to an []{string, string -> int} is much more difficult, but can be done. What we need is some sort of proxy array, which instead of storing the elements itself has a reference to a base array, and provides two functions, ‘get’ and ‘set’, for accessing the array elements. These ‘get’ and ‘set’ functions can do the conversions between the base array’s element type and the proxied array’s element type. So for the example of []{T, T -> int} we might have:

%ProxyArray = type {%RTTI*, %VFT*, i32, %BaseArray*, %GetFunc*, %SetFunc*}
                                             |
                                             V
%BaseArray = type {%RTTI*, %VFT*, i32, [0 x %StringFunction]}

%GetFunc = type %ObjectFunction (%BaseArray*, i32)
%SetFunc = type void (%BaseArray*, i32, %ObjectFunction)

%ObjectFunction = type {%RTTI*, %opaque*, i32 (%opaque*, %object*, %object*)*}
%StringFunction = type {%RTTI*, %opaque*, i32 (%opaque*, %string, %string)*}

Then, whenever the proxy array wants to get/set a value in the base array, it converts the array element between a {T, T -> int} and a {string, string -> int} as described in the function section above (which involves generating a proxy function for every array access/store).

For this to work efficiently, we’d like to know the native type of the array at compile time. To achieve this, we can say that arrays which contain anything generic will be proxy arrays, and arrays of non-generic types will be normal arrays.
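Under that rule, the compile-time classification looks like this (a sketch):

[]uint a;           // normal array: the element type is concrete
[]T b;              // proxy array: the element type is a type parameter
[]{T, T -> int} c;  // proxy array: the element type contains a type parameter
[](string, uint) d; // normal array: a tuple of concrete types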

The system, while getting increasingly complicated, looks like it’s coming together. However, there’s yet another problem…

Creating Generic Arrays

Our current model says that arrays of generic things will be proxy arrays, and arrays of non-generic things will be normal arrays. If all you want to do is pass arrays around, this works great. However, what happens if you want to create an array of type []{T, T -> int}?

In this model, the answer is: first, create the base array which stores the concrete type (functions which take strings), then build a proxy on top of it so that we have a valid reference to a generic array. But this is impossible. Let’s say we can create the base array (that part is not too hard: we don’t need to know a function’s argument types in order to allocate memory for a pointer). How do we create the proxy array? We would need to create, at run-time, get/set functions which know how to convert between {T, T -> int} and {X, X -> int}, where X is the concrete type. This involves writing a proxy function that takes an argument of type X and converts it to a T. We can’t generate code at run-time, and we don’t know what X is at compile-time, so we can’t generate code for the required proxy function at all.

In fact, the only way that I can imagine doing this is to allow only one sort of array: an array which has get/set functions. We could generate a base array which uses the get/set functions to store its own values, or we could generate an array which just proxies another array and does some type conversion. Then, in order to create a generic array like []{T, T -> int}, we could just create an array of the generic functions, and proxy it to an array of concrete functions where necessary. This will be slower. Much slower for arrays of primitive types, since we need to do an indirect function call on every access/store. But at least it will work in all cases.

If anyone (a) understands this, and (b) can think of a better solution, please comment and let me know.

Calling Generic Functions

This week, I finished most of the inheritance checking for generics. However, while moving on to expression checking, I discovered a major syntax problem. Consider this:

<A, B> B foo(?A a)
{
  // foo uses type parameters A and B, takes a ?A, and returns a B
}
// ... somewhere else:
x = foo<uint, string>(7);

This is the syntax I was planning for calls to generic functions. The type arguments are specified after the function’s name and before the actual arguments, which makes for quite a readable syntax. The problem is that it can be parsed in two different ways:

x = a < b, c > (d + e);

Here, x could be either a tuple of two booleans, containing the results of two comparisons, or the result of calling a function called a<b, c> with the argument (d + e). Obviously, this syntax is not going to work.

Here are some of the ways that other languages specify generic types:

  • C++ and C♯ use the syntax that I was going to use, but they don’t have native support for tuples, so they don’t have this problem.
  • Java (which doesn’t natively support tuples either) forces the call to be qualified, and puts the type arguments before the name of the method:
    this.<String, String>foo("hi");

I considered solving this problem by using the same syntax as Java. However, there is another complicating factor: function types. Function types can be generic; for example, we can make bar() return foo:

{<A, B>: ?A -> B} bar()
{
  return foo;
}

// now, we want to write the following, but how do we specify A and B?
bar()(null);

Now, if we were using Java’s syntax, it would be impossible to specify the generic type arguments here: writing <uint, string>bar()(null) would be ambiguous, because the type arguments could refer either to bar or to the function it returns.

The way that I’ve decided to handle this is to put the generic type arguments inside the parameter list:

foo(<uint, string>: null);
bar()(<bigint, List<bigint>>: null);

This is slightly harder to read, but completely unambiguous, and it works well with the syntax for the type of a generic function. Hopefully, Plinth will be able to infer the type arguments in most cases, so that they do not need to be specified at all.

Creating Value Types

There is another problem with this syntax: it isn’t very well suited to creating value types (or compound types, as the syntax currently calls them). Ideally, when creating something, its type arguments should be specified the same way as they are in the type itself. For example:

Foo<uint, string> foo = new Foo<uint, string>();

But creating a value type is actually treated as a function call during parsing, because they look the same to the parser:

bigint b = bigint(4);

Unfortunately, with this new syntax for generics, constructor calls for value types will look nothing like the type being created:

foo<string, uint> a = foo(<string, uint>: ); // (no normal arguments, just type arguments)

As we have already discussed, just moving the type arguments outside the argument list will not work. The only way to fix this is to use a syntax similar to class creation: put a keyword at the start of the expression. This keyword should not be ‘new’, because ‘new’ implies reference semantics – that a new one is not created every time you pass a value into a method – which is exactly wrong for value types. Instead, I have opted to use ‘create’, as it better illustrates the difference between class and value types:

string s = create string(new []uint {89, 69, 83});
foo<string, uint> a = create foo<string, uint>(s, s.length());

A nice side effect of this new keyword is that it also makes sense for constructors, and is shorter than the current ‘constructor’ keyword. So once this is implemented, constructors will be defined as follows:

class Foo
{
  uint x;
  create(uint x)
  {
    this.x = x;
  }
}

A similar change can be made for property constructors, which eliminates the need for ‘constructor’ to be a keyword at all.

Hiding Methods

This week, I was trying to rewrite the inheritance checker for generics. The inheritance checker is the compiler pass which makes sure that there aren’t any problems with overridden methods, and that only abstract classes can have abstract methods.

The question I was faced with was: Can something inherit from the same generic interface twice, with different type arguments? For example:

interface Processor<T>
{
  void process(T t);
}
class MultiProcessor implements Processor<int>, Processor<string>
{
  void process(int i) {}
  void process(string s) {}
}

This makes conceptual sense, and doesn’t have any ambiguities, so it really should be allowed. Unfortunately, there are two problems that can arise if this is allowed. The first is with the low-level implementation of method overriding, which can be changed; but the second is an ambiguity that can’t be solved without substantial changes to the language. Here’s an example that should illustrate the second problem:

interface List<T>
{
  void add(T t);
  T get(uint index);
  uint size();
}
class MultiList implements List<int>, List<string>
{
  void add(int i) { ... }
  void add(string s) { ... }
  int get(uint index) { ... }
  string get(uint index) { ... }
  // what about size()?
}

MultiList tries to be two different lists at the same time. The add() and get() methods should work fine, because they have different signatures (in Plinth, the return type is part of a method’s signature). However, the size() method has the same signature no matter which list it is from. Obviously we don’t want both lists to have the same implementation of size() – we might have 2 ints and 7 strings. In order for everything to still work when we cast from MultiList to List<int> or List<string>, we need to have two different implementations.

But with the current system, if you override a method, it overrides all of the super-type methods with the same signature, making it impossible to have two different size() implementations. The solution is to change the language to allow methods to be hidden instead of overridden. For the unfamiliar, here’s a quick description of the difference:

  • Hiding is what happens to fields: when you make a new one with the same name in a subtype, the one in the supertype still exists, but you can’t access it from the subtype (if you need to access it, you can cast to the supertype first and access it there).
  • Overriding is what happens to methods: when you make a new one with the same signature in a subtype, the one in the supertype is replaced by your new one, so that no matter which type you access it through, you are referring to the same method.

What we actually need to do to solve this sort of problem is to hide a method while providing an implementation for it, i.e. override it and hide it at the same time. We could provide implementations for both size() methods to do different things, and then hide one (or both) of them, so that you get two different answers when you do:

(cast<List<int>> multi).size()
(cast<List<string>> multi).size()

When you do multi.size(), you get whichever of them is not hidden, or a compiler error if both of them are hidden.

The syntax I am considering for this is similar to C♯’s “explicit implementation” syntax for interfaces, but more flexible:

class MultiList implements List<int>, List<string>
{
  // ... add() and get() methods ...
  hiding uint List<int>::size() { ... } // overrides List<int>::size()
  uint size() { ... } // overrides List<string>::size()
}

This gives us implementations for each of the size() methods separately, and hides List<int>::size() so that only List<string>::size() is accessible from MultiList. If we want to call List<int>::size(), we must first cast the MultiList to a List<int>.
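Used from outside, that looks like this (a sketch):

MultiList multi = new MultiList();
uint numStrings = multi.size();                // List<string>::size(), the one that is not hidden
uint numInts = (cast<List<int>> multi).size(); // List<int>::size(), reachable via the cast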

This syntax will also allow you to hide a method and provide a new method with the same signature, without overriding it. For example:

class Bar
{
  uint parse(string s) { ... }
}
class Foo extends Bar
{
  hiding uint Bar::parse(string s);

  uint parse(string s) { ... }
}

So, in order to hide a super-type’s method, we declare it with the ‘hiding’ modifier, and instead of just giving its name, we use SuperType::name.

Note: I am not completely decided on the keyword ‘hiding’. While ‘hidden’ might make more sense, I do not expect this feature to be used very often, so I don’t want to use a keyword which people often use as a variable name.

Low-Level Hiding

The other problem I mentioned was with the low-level implementation of overriding. Currently, each method has a “disambiguator”, which is a string representation of its signature, including its name and its parameter and return types. Methods are only overridden by the plinth runtime if their disambiguator strings are equal.

One problem with this way of doing things is that it doesn’t cope with inheriting members from the same super-type twice. To override both List<int>::get() and List<string>::get(), you would need to provide two methods which have different disambiguators, but the super-type only has one disambiguator for List<T>::get(), so they cannot override it properly with the current system.
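As a purely hypothetical illustration (the actual string format isn’t specified here), the mismatch looks something like this:

// hypothetical disambiguator strings, for illustration only:
//   List<T>::get() has a single disambiguator, something like "T get(uint)"
//   MultiList's two getters would need "int get(uint)" and "string get(uint)"
// neither sub-type string matches the single super-type string, so the
// runtime cannot tell that both methods are meant to override List<T>::get()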

Another problem is that it doesn’t allow a method to not override a superclass method, unless it picks a deliberately non-matching disambiguator (and doing so would be bad for binary compatibility).

To solve these problems, the way the runtime generates VFTs will have to be dramatically changed in a way I haven’t fully thought out yet. Luckily, this part of the runtime is only run once when the program is starting up, so the minor efficiency tradeoff that this will probably necessitate is almost certainly worth it.

 

For now, I’ll get back to implementing generics without support for inheriting from the same type multiple times, and support for these features will be added later on.

Generics, Nullability, and Immutability

For the last few weeks, I have been working on Generics. Last week, with some awkward grammar hacks, I implemented a parser for them that allows the ‘>>’ token to close two layers of type arguments at a time. But this week I have come to a more difficult problem: how do nullability and immutability (the “type modifiers”) work with generics?

I knew that this would be difficult, so the first thing I considered was what problems I was solving by not just disallowing nullable/immutable type arguments. It quickly becomes obvious that doing this would be a bad idea, as it would prevent you from making a List<?Foo> or a Set<#Foo> (which would make generic collections inferior to arrays!).

Before I go into any more detail, I should explain how generics will work in Plinth. Other languages use several different approaches:

  • C++ makes each generic type a template, which is instantiated whenever it is needed. This means re-generating the low-level code for the generic type whenever it is used.
  • Java type-checks everything at compile time, and then erases all of the type information at runtime. This makes it impossible to do several things, like create generic arrays.
  • C♯ uses a system somewhat similar to Java’s, but keeps the run-time-type-information around, which makes it much more flexible.

The C++ approach makes generic nullability and immutability much easier for the compiler, because before it generates code for a generic function, it already knows exactly what type it is working on. The main problem with this approach is that it requires code to be generated at the point where the type is used. This means that any changes made to a generic type will not take effect until all uses of it are recompiled. This type of system would not work for Plinth, because Plinth makes much stricter guarantees about binary compatibility.

Plinth will use a system very similar to C♯’s, but with the type system expanded to support nullability and immutability. One very minor difference is how Plinth will treat value types: it will implicitly cast them all to object, making them pointer types rather than value types. This is necessary so that Plinth only has to generate code for each generic type/method once.
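In practice, that means value types are boxed at the generic boundary. For example (a sketch, borrowing the List interface from above):

List<uint> list = ...;
list.add(7);          // 7 is implicitly boxed to an object pointer
uint u = list.get(0); // and unboxed again when it comes back out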

Nullability

The difficult thing about mixing generics with nullability is that in general you can’t know whether or not a type argument is nullable. The following example should illustrate this:

class Foo<T>
{
  T field;
  void foo()
  {
    object o = field; // error: T might be nullable
    field = null; // error: T might not be nullable
  }
}

Both of these errors correspond to real problems with certain type arguments. For example, the first error becomes real when we make a Foo<?object>, and the second when we make a Foo<object>.

The compiler represents this type internally as a “possibly-nullable” type, which exists in between nullable and not-nullable. The type hierarchy says that not-nullable is a subtype of possibly-nullable, and possibly-nullable is a subtype of nullable. This allows you to, for example, convert from a T to a ?object, or to convert from a not-null subtype of T to a T.
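For example (a minimal sketch):

class Baz<T>
{
  void demo(T t)
  {
    ?object o = t;   // allowed: possibly-nullable converts to nullable
    // object x = t; // error: T might be nullable
  }
}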

This subtype relationship gives an obvious way of specifying to the compiler that you don’t want a type to be nullable: say it extends a (not-nullable) object. Unfortunately, there is no easy way to say that a type parameter should always be nullable, but if this is ever necessary it can be worked around by always using ?T instead of T. For example:

class Bar<T extends object, S>
{
  T notNullableField;
  ?S alwaysNullableField;
}
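Given that declaration, usage works out like this (a sketch):

Bar<string, uint> bar = ...;
bar.notNullableField = "hi";    // fine: T extends object, so T is not-nullable
bar.alwaysNullableField = null; // fine: the field's type is ?S, which is always nullable
// Bar<?string, uint> bad = ...; // error: ?string does not extend (not-nullable) object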

The full inheritance hierarchy for nullability is:

         nullable
            ^
            |
possibly-nullable
            ^
            |
     not-nullable

An arrow points from a subtype to its super-type.

Immutability

At first glance, immutability looks very similar to nullability where generics are concerned. It has a very similar subtype relationship: not-immutable is a subtype of possibly-immutable, and possibly-immutable is a subtype of immutable. Immutability just has the added restriction that you can’t downcast from immutable to possibly-immutable or from possibly-immutable to not-immutable.

This is extremely close to a complete model, and almost works just as well as the one for nullability. It is only a tiny edge case which makes it completely different in practice.

An edge case

Inside an immutable function, any fields you access (such as a property’s backing variable) are implicitly considered both final and immutable. So normally, from an immutable method, you would not be able to return one of your object’s fields unless the result type was ‘#Foo’ (immutable Foo) rather than just ‘Foo’. Also, by default, a property’s getter is an immutable function.

With just these rules, a getter would either be unable to return a non-immutable object, or it could be marked as ‘mutable’, which allows it to return a non-immutable object but stops it from being called from inside an immutable function.
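To make that concrete, here is a minimal sketch using ordinary methods (I’m assuming here that ‘immutable’ and ‘mutable’ can be written as function modifiers, and ‘#Foo’ denotes an immutable Foo, as above):

class Box
{
  Foo contents;

  immutable #Foo peek()
  {
    return contents; // contents is contextually immutable here, so the
                     // result has to be typed #Foo
  }

  mutable Foo take()
  {
    return contents; // may return a plain Foo, but this method cannot be
                     // called from inside an immutable function
  }
}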

Luckily, there is an extra rule: when you access a property (as with a field), the resulting value is implicitly immutable if the object you are accessing it on is immutable.

This extra rule doesn’t change anything about what you can do with immutable values, but it does allow us to add a special case to property getters. If a property getter is an immutable function, and the property is of some type ‘Foo’, and the getter tries to return a field of type ‘Foo’ (which is considered implicitly immutable because the getter is an immutable function), then the getter is allowed to return the value even though it is immutable.

To model this type of scenario, the compiler has a concept of ‘contextual immutability’. This is an extension to the type system which cannot be represented explicitly in code, but represents what happens if something gets made immutable purely because you are accessing it in an immutable context.

Generic Immutability

On top of contextual immutability, generics adds the concept of a possibly-immutable value. What happens if we combine them?

Say we tried to access a field with a possibly-immutable type in an immutable context. If the type argument turns out to be immutable, then the value should be immutable, but if the type argument turns out to be not-immutable, the result should be contextually immutable. This situation could be called something like ‘immutable-or-contextually-immutable’. Obviously, this makes the type system much more complicated, but it is required in order to allow properties to deal with generic types properly.
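For example (a sketch, using only syntax shown in earlier posts):

class Holder<T>
{
  property T value;
  // value's getter is an immutable function by default; inside it, the
  // backing variable is immutable-or-contextually-immutable, and the special
  // case above is what lets the getter still return it as a plain T
}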

The full inheritance hierarchy for immutability is:

                     immutable
                      ^     ^
                     /       \
                    /         \
                   /        immutable-or-contextually-immutable
                  /            ^
contextually-immutable         |
                 ^             |
                  \         possibly-immutable
                   \           ^
                    \         /
                     \       /
                   not-immutable

An arrow points from a subtype to its super-type.

Of course, the only ones that you can explicitly specify are ‘immutable’ and ‘not-immutable’.

Plinth Status Update

This week, I thought I’d give an update on which parts of Plinth are finished and what still needs doing.

Last Tuesday, I finished implementing properties. They work as described in the last few blog posts, with a few extra considerations for initialising inherited properties. Since then, I have been fixing various small omissions and other problems, such as an edge case in inheritance checking, and making sure value-typed variables do not alias each other.

I have also added support for “super.foo” to access fields and methods on a super-class/interface. For methods, this results in a non-virtual method call.

There are several major things which still need implementing:

  • Generics
  • Enums
  • Default and Variadic Arguments
  • For-each loops
  • Closures
  • Access Specifiers (public, protected, package, and private)
  • Standard Library
  • Garbage Collector

The next thing I will be working on is Generics, which will support using any type as a type argument, including things like primitives and functions. Generic types will not be erased at run-time; instead, run-time-type-information pointers will be passed into functions and stored inside objects, so that the type can still be used at run-time.

After Generics, the biggest barriers to the language being useful are the standard library and the garbage collector. The standard library includes (at a minimum) various data structures and APIs for using files and network sockets, which should not be too difficult to implement. On the other hand, the garbage collector is a difficult problem to solve efficiently in theory, especially in concurrent scenarios, so I am expecting it to take quite a while.

While these features are being implemented, I am writing documentation for the language, including a language specification. It is very much a work-in-progress, but it is located here: http://bryants.eu/plinth/

If you want to try out the language, some of the commits over the last week have made it much easier. There’s now an ant build script which will compile the compiler, runtime, and standard library. There’s also a bash script which greatly simplifies running the compiler. Everything you need to know should be in the readme on GitHub, but if there’s anything else you’d like to know, please ask.