Category Archives: Uncategorized

Shift-reduce conflicts

My principles while writing the grammar for this language have been:

  • Do not change the syntax to suit the grammar, unless the proposed new syntax makes more sense than the original.
  • Do as much work as possible during the parsing phase: do not fix a parsing problem by adding rules which add more work to the semantic analysis phase.

The result of this should be a cleaner, easier to use language. However it can cause problems while writing the grammar, usually in the form of shift-reduce conflicts. Some of the conflicts I’ve encountered are below.

Current problem

My current status in the language is that I’ve just started on statement parsing (having finished expressions). However, there’s one problem which looks like it’s very difficult to solve nicely while leaving the syntax unchanged.

The problem is that the start of a tuple type can look exactly like the start of a parenthesised expression. To illustrate the problem, here is an example of what a statement that starts with tuple type looks like:

(Comparable<String>, int, String) x, y, z = foo();

and here is an example of what a statement which starts with a parenthesised expression can look like:

(a < b ? foo : bar).test();

You can quite easily see that both of these start with a left parenthesis, followed by a Name, followed by a left angle bracket. Also, they are both statements, and are allowed in exactly the same places in a program.

The problem here is what the parser does once it gets to the left angle bracket. It can either shift the Name (which has now been reduced to a QName, or qualified name) onto the parse stack in anticipation of the first scenario (a list of type parameters), or it can reduce it into a Field Access Expression, in anticipation of a comparison with some other expression. This is a shift-reduce conflict, and makes it impossible to parse the language without rewriting part of the grammar. In this case, it will be difficult to rewrite the grammar because it spans so much of the Expression and Type hierarchies.

Other (solved) problems

I have encountered a few other problems while writing the grammar for this language. Here are some of them.

Comparison vs Shift Expression

If “>>” is actually two right angle bracket tokens, then it becomes hard to disambiguate between a shift expression and relational (comparison) expression; the following statements are hard to decide between:

int a = b >> c;
int a = b > c ? 8 : 16;

In the case of finding something followed by a right angle bracket, the parser can either shift and expect another angle bracket for a shift expression, or it can reduce and expect a relational expression. The easy way to fix this is to make “>>” into a single token. This, however, causes another problem. Type parameters can easily end in “>>”, for example:

Iterator<Comparable<String>> iterator = blah();

To fix this, a set of rules for type parameters had to be devised so that the programmer doesn’t have to add a space between the two angle brackets. It involves rules for TypeParameterListRAngle, which is a type parameter list with a single trailing right angle bracket. The solution requires the rules for both PointerType and Type to have variants which end in right angle brackets, along with TypeParameter, WildcardTypeParameter, TypeArgument, and others (the grammar is at the end of this post).

Type vs Expression

This problem was to do with closure creation. The original syntax for closure creation was:

{int -> String throws Exception} intToString = String closure(int x) throws Exception
{
  throw new NotImplementedYetException();
}

But this caused the same conflict as the one I am currently working on. When this happened, I didn’t realize that there were other situations where Type and Expression could both occur in the same place. I now realize that the problem I am currently working on is a more general version of this one, and once it is fixed, the old syntax for closures would work just as well.

However, because I didn’t realize at the time, I came up with this new syntax for closures:

{int -> String throws Exception} intToString = closure(int x -> String) throws Exception
{
  return "" + x;
}

The only reason I switched to this syntax instead of solving the problem was that I thought that, by using the arrow to indicate the return type list, this was more in keeping with the closure’s type signature.

Static Initializer vs Field/Method definition

Static initializers start with the same thing as a field/method definition, a member header. This is basically a combination of an optional access specifier and an optional list of modifiers. The reason for this is to avoid shift-reduce conflicts between the static keyword and the member header on other members. A bad member header for a static initializer (anything but a single “static” modifier) is detected and causes an error during parsing.

The difference between static initializers and fields/methods starts immediately after a member header. A static initializer starts with a block, but a field/method starts with a type, which could be a closure type. Since blocks and closure types both start with open braces, the start of a statement could also be the start of a list of types. This can again cause the Type vs Expression problem, but once that is solved a comma could indicate an assignee list as part of an assignment statement, or a type list as part of a closure type. This may be relatively easily solved by using a right recursive list for both types and assignees, but there is an alternative syntax which would make things slightly easier for the parser:

class Foo
{
  static Foo
  {
    String a, b = "hello", "world!";
  }
}

My argument for this syntax is that it makes it look slightly more like a constructor, which also does initialization. However, it does necessitate more typing (when you actually use a static initializer). So although this has currently been solved by changing the syntax, I would be happy to change it back again if anyone presents a reasonable argument for the old (or indeed any other!) syntax.

Grammar

Here is the current Grammar for the language. It still has conflicts and there are some things haven’t been implemented yet, but feel free to have a look anyway. When I have finished the grammar I will likely attach the final version to a blog post.

Generics/Templates

I’ve been thinking more about the generics/templates problem today. I can think of two ways of doing it, each having its own advantages and disadvantages.

Templating

Doing things this way would mean checking the types with the generic type arguments, and then creating a new specialization in machine code for each combination of type parameters. The type parameters would only apply to instances of a class (static members cannot relate to the type parameters).

This has the advantage that separate methods are created for each specialization of the type parameters, which means that if someone uses a tuple as a generic type parameter, it does not have to be autoboxed into a pointer type to be passed to the method (a win for performance, since it requires fewer heap allocations).

It also has a major disadvantage. If a library defines a generic class, then any code using that library will have to include its specializations as part of its own machine code. This could cause problems for dynamically linked programs, for example in the following program:

class LibStack<E>
{
  E[] stack = new E[0];
  void push(E e) { ... }
  E pop() { ... }
  E peek()
  {
    return stack[getFrontPos(stack.length)];
  }
  static int getFrontPos(int size)
  {
    return size - 1;
  }
}

(Try to ignore the bad implementation for a minute, it helps me to demonstrate the problem)

So far, this works fine and an application writer has decided to use LibStack<Int> in their application, which generates the specialization for Int inside their executable. This works great until the library writers decide to change the front position of the stack to be the start of the array.

static int getFrontPos(int size)
{
  return 0;
}

This piece of code is changed, along with the code for push(), peek() and pop(). Unfortunately, the new code for push(), peek() and pop() will only reach the application’s code when it is recompiled. So the static method will be changed (as it is loaded dynamically from the library), but the push, peek and pop methods are stored in the application’s code, and so the old versions are used. This results in application code such as the following breaking:

void main()
{
  LibStack<Int> stack = new LibStack<Int>();
  stack.push(new Int(1));
  stack.push(new Int(2));
  stack.push(new Int(3));
  println(stack.peek()); // not necessarily how std-out will be implemented in the language
}

With the old version of the library, this will print 3 as intended, however the same binary loading the new version of the library will print 1 instead.

Generics with Run Time Type Information

Doing things this way would provide the same interface to the programmer. The class/method/whatever would be type-checked with generic type arguments, and the type parameters for classes would only apply to the instances of a class, not the static members. The difference is that this way, only one copy of the class and its methods is created in machine code.

This has the advantage that additional machine code is not generated for each specialization, and so it avoids the problem with templates above as well as lowering code size.

The disadvantage of this way of doing things is that each function can only take a certain number of arguments, so tuples and other primitive types must be autoboxed into a pointer type before they can be used by a generic class/method/whatever. This boxing causes a heap allocation (essentially it would internally call a method with ‘new BoxedIntStringLongTuple(a, b, c)’ instead of just ‘(a, b, c)’ as the parameter).

This method currently sounds very much like Java’s Generics, which have type erasure so that you cannot, for example, create an array of a generic type parameter. However there is something that can be done about this: Run Time Type Information. Basically, this involves passing information about the type parameter into the classes/methods/etc that use it. So a generic class would have generic type information embedded into it, and methods of that class could query this type information to create an array of that type (of course, this would really be a boxed type if a primitive type such as a tuple or a closure was used).

Autoboxing (which causes lots of heap allocations) also provides us with an advantage. If only pointer types are used in generics, then all functions that use generic type parameters have the opportunity to call standard Object functions on them; that’s right: equals(), hashCode() and others can be provided for boxed tuples, closures, and other primitive types. These functions would have to be implemented sensibly. For tuples, they should only delegate their calculations to the values the tuple contains. But if you want to create a Set<{ -> void}>, or a Map<(int, {int -> String}), String>, it should be possible.

Again, these are only ideas, comments/suggestions for improvements are appreciated.

Language Update and Syntax Fragments

I’ve been working on my language more recently. Most importantly, I’ve implemented an LALR(1) parser generator to work with. Now that it’s working, I can work on the grammar and AST (Abstract Syntax Tree).

So far, I have the start of a parser that will take a list of tokens (the tokenizer currently just gives a constant list of tokens) and parses it directly into an AST. Currently, it can generate ASTs for the top level of the grammar, as far as the following:

CompilationUnit = epsilon
                | CompilationUnit PackageDeclaration
                | CompilationUnit ImportDeclaration
                | CompilationUnit TypeDefinition
PackageDeclaration = package QName ;
ImportDeclaration = import QName ;
                  | import QName . * ;
                  | import static QName ;
                  | import static QName . * ;
TypeDefinition = ClassDefinition
               | InterfaceDefinition
AccessSpecifier = public | private | package | protected
                | package protected | protected package
                | epsilon
Modifiers = Modifiers Modifier | epsilon
Modifier = static | abstract | final | immutable | synchronized
         | NativeSpecification | transient | volatile
NativeSpecification = native ( StringLiteral )
ClassDefinition     = AccessSpecifier Modifiers class Name ClassExtendsClause ImplementsClause { MemberList }
InterfaceDefinition = AccessSpecifier Modifiers interface Name InterfaceExtendsClause { MemberList }
ClassExtendsClause     = extends PointerType | epsilon
ImplementsClause       = implements InterfaceList | epsilon
InterfaceExtendsClause = extends InterfaceList | epsilon
InterfaceList          = InterfaceList , PointerType | PointerType
QName = Name | QName . Name

That’s just a fairly basic representation of the top level of the language. It’s quite similar to Java in a few respects, but it also has some noticeable differences.

Another thing I’ve been doing is thinking about how certain other parts of the language will work. You’ll notice that there are no generics/templates in the above grammar – this is because I haven’t even fully decided on the syntax or the semantics yet. So here are a few thoughts about various things:

Closure Types

{A -> B} f;
{A, B -> void} g;
{ -> void} run;
{A, B, C -> D, E} h;
{int, String -> String, int} swap;
{File -> String throws IOException} read;
{File, {String -> String[]} -> String[] throws IOException} splitFile;

This is currently my favourite choice for the syntax of representing the type of a closure (or any method pointer). Closures can handle exceptions, which is important if they are to be assigned from methods. They also support tuples for both parameters and results. This is important, as all function arguments and results will be tuples in this language. This allows any set of values to be passed in or out of a function.

They do not, however, support default arguments. Default arguments is really just going to be a syntactic sugar applied when calling certain types of functions, and you cannot find the values of the default arguments without also being able to see which function is being called. This restriction will also affect inheritance of default argument functions.

Closure Creation

Closures can be created in two different ways. The first, assigning from a method, is trivially simple:

(String, int) swap(int x, String y)
{
   return y, x;
}
...in some other piece of code...
{int, String -> String, int} swapValues = swap; // if there are multiple definitions of swap(), this infers the correct one

The other way, creating an anonymous closure, is slightly more involved:

{int, String -> String, int} swapClosure = (int, String) closure(String a, int b)
{
   return b, a;
}; // the semicolon is required as the closure creation is part of the expression, not its own statement

This creates an anonymous closure using approximately the same syntax as a method definition. The differences are that access specifiers and modifiers are not necessary, and that the anonymous closure specification does not have a name, it uses the ‘closure’ keyword in place of one because a closure has no use for a name before it is assigned to a variable.

Templates/Generics

Currently, I’m not sure how this part of the type system should be implemented. There are, however, a few constraints on them:

  • A type parameter must be able to represent:
    • A single class/interface type
    • A primitive type (int, long, double, boolean etc.)
    • A tuple of multiple different types. (facilitates maps where the key/value is a tuple, among other things)
    • A closure (method pointer)
    • Nothing at all (like the Void (with a capital V) type in Java)
  • It must also be able to support some restrictions on type parameters, such as:
    • X is a type which extends/implements some base class/interface Y
    • X is a superclass of some other type Y
  • It must also support wildcards, so that, for example, a method can take a Set<? extends SetItem>

Also, it would be much nicer if it didn’t use type erasure, so that the type information is available at runtime, and so that new instances of a type argument and arrays of instances of a type argument can be created.

It is likely that there will be problems with some of these constraints (there will have to be some boxing of tuples for generic function calls), but having such an expressive type system would be a great advantage in a new language.

If anyone’s reading this and has a suggestion on how to do any of this better (or how to implement templates/generics at all) please don’t hesitate to comment below or email me.

Mobile GMail and the N900

I’ve had my new N900 for just over a month now, and it seems to be working very well. There are a few little annoyances though.

The N900 has a full web browser with most of the features you use on a desktop; it even has flash, so I can watch youtube without a phone-specific app. This is really useful, but a couple of sites spoil it slightly by detecting the user agent of the browser (MicroB, based on firefox) and redirecting it to a mobile version of their page. I’d rather use the standard version of these sites in most cases.

The main site that does this is GMail. It automatically redirects mobile phones to their mobile UI (which is here). There is a way to fix this without changing your user agent, just go to this page instead:

https://mail.google.com/mail/?ui=2

This takes you to the standard version of GMail no matter what your user agent is. However, if you’re not logged in it will take you back to the mobile login page, which takes you to the mobile site again when you log in. To fix this, use the following link – it takes you to a login page which redirects you to the standard GMail without checking the user agent.

https://www.google.com/accounts/ServiceLogin?service=mail&passive=true&continue=http%3A%2F%2Fmail.google.com%2Fmail%2F%3Fui%3D2

I hope this helps someone.

Thanks go to the people on the Maemo forums for finding this workaround (here’s the thread: http://talk.maemo.org/showthread.php?t=55888)

Views on the Digital Economy Bill

I recently took part in the Open Rights Group’s campaign against disconnection from the internet (which is a major part of the Digital Economy Bill). Today I received the last email back from the candidates I wrote to. My original letters to Martin Horwood (Liberal Democrats) and Mark Coote (Conservatives) are here and here (the letters are identical, only the names at the top differ).

The Liberal Democrat candidate replied first, early on Monday morning. The Conservative candidate replied later, on Tuesday evening. (This may not be very relevant, I just added it for the sake of openness).

The replies are as follows. First the Liberal Democrat candidate:

Dear Anthony,

Thank you for contacting Martin about the Digital Economy Act. The Liberal Democrats voted against the bill, and were not supported by the Conservatives, who abstained on the vote, scuppering any chance of effective opposition to the bill.

We have been highly critical about the so called “wash-up” process which has enabled this Bill to pass with limited Parliamentary scrutiny before the General Election. The “wash-up” of the Digital Economy Bill was essentially a carve up between the Labour and Conservative parties that ignored Liberal Democrat arguments to consult more widely before introducing a measure to introduce web-blocking for copyright infringement. Liberal Democrats voted against the Bill at 3rd Reading in the House of Commons and against the Labour and Conservatives web-blocking amendment in both the Lords and the Commons.

Liberal Democrats remain to be convinced about the necessity for technical measures, which could include disconnection from the internet. Liberal Democrats were successful in getting the Government to agree to a period of at least a year in which no technical measures can be considered and then to undertake a process of rigorous analysis and consultation into the need for any such measures. We also believe that the music, film and other content industries must work more urgently to develop easy and affordable ways for people to legally access their products.

The recent Liberal Democrat conference in March voted to establish a party working group to look into further detail about the issues raised by the Bill.

Nick Clegg has vowed to repeal the Act if the Liberal Democrats are involved in the next government and give the issues proper time and consultation. “We did our best to prevent the Digital Economy Bill being rushed through at the last moment. It badly needed more debate and amendment, and we are extremely worried that it will now lead to completely innocent people having their internet connections cut off,” said Clegg. “It was far too heavily weighted in favour of the big corporations and those who are worried about too much information becoming available. It badly needs to be repealed, and the issues revisited.”

If Martin is re-elected as your MP he will join his Liberal Democrat colleagues in fighting to repeal this Act and allow due consideration for the issues. He would also be interested in attending Eric Joyce’s meeting on this issue in the new parliament.

If you have any further questions, please let me know.

Best wishes,

Victoria White

Martin Horwood’s campaign for Cheltenham

So the Liberal Democrats are against disconnection from the internet, instead preferring that content providers provide better ways of accessing their products. In my opinion, this is the way to go. Internet Service Providers (ISPs) should not have to police copyright infringement, as the Digital Economy Bill requires them to (Also, they are not equipped to deal with it, because of things like protocol encryption).

The Conservative candidate’s response is as follows:

Dear Anthony

Thank you for getting in touch about the Digital Economy Act. I certainly share your anger about how the Government rushed through such an important piece of legislation in the dying days of the last Parliament. There was absolutely no reason why they couldn’t have introduced the Bill earlier into the House of Commons, where MPs would then have been able to debate it at length. It just shows the contempt in which they hold both Parliament and the industries affected by this Act.

There can be no doubt that the legislation raised serious questions both for consumers and rights holders. That’s why we took steps to improve and strengthen the legislation in Parliament. Conservatives in the Lords sought to ensure the Bill contained a clear appeals process and robust standards of evidence where allegations of piracy are made. In the Commons, we required the Government to remove inadequate proposals for dealing with orphan works and abandon plans to establish worrying powers to amend copyright law by secondary legislation.

As you know my Party did support many of the measures in this Act. For instance a single age rating system for video games was needed to help parents understand what sort of games are appropriate for their children, and Channel 4 needed an updated remit. Most significantly, something needed to be done to try to reduce online piracy which was estimated to have cost the UK 39,000 jobs in 2008 alone.

We believe that the Act sets up a proportionate and measured response to this problem and that it contains sufficient safeguards through an appeals process and Parliamentary scrutiny for consumers to be protected. It is important to note that only the most serious and consistent offenders will face the threat of disconnection and this will only be done after they have received numerous letters and gone through an appeals process. So although I understand the strength of feeling on this proposal I do not want to rule out temporary disconnection. I will listen to the debates in Parliament and closely follow the drawing up of the codes that will govern this process to make sure that the interests of legitimate users are upheld.

In terms of website blocking I believe that in some circumstances such measures may well be needed. It cannot be right that websites are set up purely to make money by facilitating online piracy. Again though, these proposals will be consulted on and debated in Parliament so I look forward to taking part in that process.

I am also keen to ensure that the copyright system reflects the realities of the modern world. We have repeatedly urged the Government to introduce a ‘format shifting’ exception to decriminalise music listeners who convert legally-owned CDs to other digital formats. This was one of many recommendations in the 2006 Gowers Review which the Labour Government failed to implement.

Ultimately, the UK needs a robust copyright system that provides the right balance of freedom for users and incentives for creators. With the rise of online piracy the Government has clearly failed to maintain that balance, and I believe we have now reached the stage where clear and considered legislation is required.

Although I understand that we may not agree on this issue I am very happy to consider all aspects of this important debate. As such I would be more than happy to attend any meeting on this subject should I be elected on 6th May.

Kind regards

Mark Coote

Prospective Parliamentary Candidate for Cheltenham

So the conservatives are in favour of disconnection from the internet, citing job losses. They do not have any response to my statement on businesses providing wifi hotspots, which could easily become very rare if people used them to download copyrighted material. (Note that although the Liberal Democrats did not mention this either, they were against disconnection anyway).

Also, I am unsure if the figure they give for job losses is entirely accurate. The only source I can find stating it is the Sun, and even there it is only mentioned in the title. Something these sort of statistics often do not take account of is that illegal downloads do not necessarily equate to missed sales – people could easily download something for free that they would never consider buying.

On the subject of web blocking, things look the same again. The Liberal Democrats are against it, and the Conservatives are in favour of it.
From a purely technical perspective, web blocking is impossible. It can be easily bypassed by using a proxy in another country (e.g. http://proxy.org/)
From a philosophical perspective, web blocking sounds like a bad idea. Cutting down people’s ability to communicate via the web is an imposition on free speech.

Suffice to say, I know who I would be voting for in this election had my postal vote application not been late.

Language X

Recently I’ve been thinking of what my ideal language would be like. I really like Java, but there are lots of little annoyances that you find in everyday usage. Something I’ve heard a few people complain about its verboseness, an example of this is that every method and member must be declared public or private, lest you accidentally leave it package private and have bad encapsulation. To combat this and other problems, here is a preliminary design for Language X:

Java-like syntax
Basically, method calls, operators, and other things from Java would work with Language X. Hopefully to the point that it would be completely source compatible and you could compile Java programs with X.
The only exception I will make here is that method and variable identifiers *must* start with a lower case letter, and type names *must* start with an upper case letter. This is a common style in Java and makes things much easier to read, however some code doesn’t keep to these conventions and it gets very annoying very quickly. Therefore, case is enforced for the first character.

Native code compilation
Nothing much to say here, basically it would compile to native code instead of some bytecode language like Java or C# do. The fact that it is not interpreted would hopefully make it much faster.

Access specifiers
In my opinion, Java access specifiers are done wrong. This is not because the levels are badly organised (package being more restrictive than protected can be nice in certain circumstances) but because the defaults are not sensible. In Language X:

  • Member variables would be private by default
  • Methods would be public by default (this includes getters and setters for properties)
  • Package would have its own keyword “package” (although this could cause slight problems with the grammar if “package” is also used for declaring which package something is in)
  • Protected would not include the package access level, and package and protected could be combined if desired.
  • Classes would be at the package level by default (I’m not completely sure on this one, public might be a better default)

Properties
A way of defining properties, to make getters and setters obsolete. To take a nonsensical example, defining properties would go something like:

class X
{
    // automatic getter and setter
    property String firstName get set;
    property String surname get set;
    // custom getter
    property String fullName get { return firstName + " " + surname; };
    int seconds; // just a member variable
    // custom setter
    property int hours
    get { return seconds / 3600; }
    set { seconds = value * 3600; };
}

Note that in fullName and hours, the backing variable is not used. The compiler should be able to detect this and automatically eliminate this if possible (however this could bring binary compatibility concerns).
Using these properties would be as follows:

X x = new X();
x.firstName = "Anthony";
x.surname = "Bryant";
x.hours = 3;
System.out.println("Full name: " + x.fullName + ", " + x.hours + " hours");

Default Arguments
Default arguments would exist in this language, and be done properly (unlike in C++). The reason languages like C++ do them badly is that you can only leave off the last argument(s), not arguments in the middle of the list. To correct this, default arguments would be given a name that must be specified by the caller if the default is to be overridden:

class X
{
    int foo(int x, int @y=5, int @z=7) { return x+y+z; }
}

which could be called as follows:

int a = x.foo(1, @z=9); // override only z, not y
x.foo(6, @y=a); // pass a variable into y, leave z as its default
x.foo(2010, @z=(12*(1+4)), @y=9*a);  // specify y and z in a different order

This could even allow the compiler to let default arguments to be specified anywhere in the parameter list, although it may be easier to read if they are only allowed at the end.

Tuples
Tuples would be supported in X, they would be simple data structures that are passed by value like primitive data types. This would allow you to easily return two values from a method without defining a new object to store them, or just swap two variables very easily. A method return type could be as follows:

class X
{
    (String, int) foo()
    {
        return ("a", 12);
    }
}

And it could be used like this:

X x = new X();
// creates a new tuple 'a' and assigns the result of foo() to it
(String, int) a = foo();
// creates two new variables, text and num, and assigns the result of foo() to them
(String, int) (text, num) = foo();
// swapping variables
int x = 12; int y = -1; int z = 5;
// creates a tuple containing y and x, and assigns it to another tuple containing x and y
(x,y) = (y,x);
// moves the values between x, y, and z
(x,y,z) = (z,x,y);

Closures
[Update 24/03/2010: This will need a rethink to work properly with generics. Partial application support would still be nice, but function type signatures may have to be changed.]
Closures in Language X would be haskell-like, and support partial application and currying. The best way to explain is by example:

{String -> int -> void} a = void closure(String text, int num)
{
 System.out.println(text + " " + num);
}

This can now be called as follows:

// calls the whole function at a time, returns void
a("text", 12);
// calls the function on only the first argument, producing another closure
{int -> void} b = a("text");

The internal structure of this sub-closure would be something like:

{int -> void} b = void closure(int num)
{
    a("text", num);
}

Within which it would store final references to ‘a’ and the string “text” that was passed into a when it was generated.
The way closures are defined also means it is possible to pass closures into other closures:

{int -> int -> {int -> String} -> String} a =
void closure(int start, int end, {int -> String} function)
{
    String s = "";
    for (int i = start; i < end; i++)
    {
        s += function(i) + (i == end - 1 ? "" : " ");
    }
    return s;
}

Also, methods can be assigned to closures, as a type of function pointer. This allows you to do:

class X
{
    void foo(String s) { System.out.println(s); }
    void bar({String -> void} f, String s) { f(s); }
    void baz()
    {
        bar(foo);
        {String -> void} func = foo;
        bar(func);
    }
}

An addition to this that could be useful under certain conditions would be the ability to cast a function which takes n arguments into a function which takes n+k arguments and does not use the last k. This would have to be an explicit cast, as it is a very obscure feature. However, it could be very useful if you want to use a function as an event listener but do not care about some of the parameters.

Immutability
Language X would also have compile-time support for immutability. This is one of the features I really like about C++, although making everything const-correct is very difficult to do, and the const keyword is too long to put in front of every immutable object. Therefore, references to immutable objects would be marked as such in the type declaration by a ‘#’ symbol. Method definitions could be marked by the ‘immutable’ keyword, meaning that this method can be called on immutable references to objects. Entire classes could also be marked with the ‘immutable’ keyword, signifying that all methods of this class are immutable. With regards to properties, getters would always be immutable methods, while setters would always be non-immutable. Immutability would work something like the following (properties are not used so that the point of immutability can be illustrated):

class X
{
    int x = 0;
    immutable int getX() { return x; }
    void setX(int x) { this.x = x; }
    // Compiler error: cannot alter a variable from an immutable method
    // immutable void alterX() { x++; }
}

This type of class would be used as follows:

X x = new X();
x.setX(4); 
#X immX = x; // casts x to immutable implicitly
immX.getX(); // returns 4
x.setX(2);
immX.getX(); // returns 2, as x still points to the same object as immX
// This would be a compile time error, you cannot call
// non-immutable methods on immutable references to an object:
// immX.setX(1);

This would seem to be fine, however what happens when someone wants to do some caching? Maybe a function call takes a long time, but once the result has been retrieved it won’t ever change. This is the type of edge case where const casting comes into C++, however this allows library users to hack a piece of code they are using and cause problems.
Instead, I propose a ‘mutable’ keyword: when a member variable is marked as mutable, it can be altered by immutable methods. This does of course allow a library writer to store a reference to ‘this’ in a mutable variable, however setting this edge case aside, the user of a library will not be able to mess around with const-correctness without changing the library. The mutable keyword would be used as follows:

class X
{
    mutable double resultCache = -1;
    immutable double longCalculation()
    {
        if (resultCache > 0)
        {
            return resultCache;
        }
        double result = /* some calculation which takes a long time */;
        // store the result in the cache, which is mutable and therefore allowed
        resultCache = result;
        return result;
    }
}

This class could be used as follows:

#X x = new X(); 
x.longCalculation(); // takes a while, first time being run
x.longCalculation(); // returns almost instantly, cached result

Conceptually, this is much nicer than in the alternative in that it allows the programmer more freedom, and immutable objects do not have to be completely pure in order to be treated as immutable.

Script files
Script files basically do exactly what you would expect. Instead of writing:

class X
{
    static void main(String[] args)
    {
        // do some stuff
    }
}

You could write:

// do some stuff

This would just give you exactly the same encapsulating definition as in the first example, but you can leave off all of the cruft before and after the code you want to write.
The only real limitations here are that (a) The command line arguments are already specified as being called args; (b) The class name is already specified as being the basename of the file that the code was defined in; and (c) the script is assumed to be in the default package. Also, it would have a different file extension from normal code files.

Binary Compatibility
This is a very important point. C++, because it is compiled to native code, can give library writers many limitations. These include not being able to add more virtual functions or change the order of virtual functions. These should hopefully be possible to solve by (a) sorting the methods of a class before writing the virtual function table, and (b) adding a ‘since’ construct. The since construct would be a way of marking new methods and member variables; methods with a higher since definition would get added later on in the virtual function table. Here is an example:

class X
{
    void foo() {}
    void foo(int x) { System.out.println("x=" + x); }
    since(1.8.14) void bar(int x) { foo(x*2); }
}

Users who have the old version of X, which only includes foo() and foo(int), will obviously only be able to call those functions. Users who have version 1.8.14 or later will be able to call bar(int) as well, however users who have version 1.8.14 or later and are running software that only assumes they have the old version will still be able to call foo() and foo(int).
This may not seem surprising, but in C++ binary compatibility can be a nightmare because this is impossible. If sorting functions into a better order and playing around with the virtual function table slightly can solve this, then it should be done to make library writers lives much easier.

Native Interface
The native interface would be much simpler than, for example, JNI. Every method that can be called from native code must be marked in the actual code with a “native” keyword and an associated linker name. Similarly, a function that has a native implementation must have the same specifier. An example of a native interface could then be:

class X
{
    native("X_foo") int foo(String s);
    native("X_getRandomNumber") int getRandomNumber() { return 4; }
}

And this could be used in native C code by doing something like:

xlang_int X_foo(xlang_object s)
{
    // a call back up to the String class
    // by the naming convention, this would call s.getChars()
    xlang_char *chars = String_getChars(s);
    printf("%s", chars);
    return X_getRandomNumber();
}

And that is a basic definition of some features I would like to see in a new language. If I come up with anything else I’ll probably update this post. I may try to implement it if I ever get around to it, but don’t expect anything any time soon. Thanks for reading!

zsh prompt

I’ve just spent an inordinate amount of time writing a zsh prompt:

export PROMPT='%{%(!.$fg[red].$fg[yellow])%}${(r:1:)${(%):-%~}}$([
[ ${(r:1:)${(%):-%~}} != "/" && ${#${(%):-%~}} -gt 1 ]] && echo /)
$([[ ${#${(@s:/:)${(%):-%~/a}[2,-1]}} -gt 3 ]] && echo ${(j:/:)${(
@r:1:)${(s:/:)${(%):-%~}[2,-1]}[(ws:/:)1,(ws:/:)-3]}}/)$([[ ${#${(
@s:/:)${(%):-%~/a}[2,-1]}} -gt 2 ]] && echo ${(@j:/:)${(s:/:)${(%)
:-%~}[2,-1]}[-2,-1]})$([[ ${#${(@s:/:)${(%):-%~/a}[2,-1]}} -eq 2 ]
] && echo ${${(s:/:)${(%):-%~}}[-1]})%{%(!.$fg[yellow].$fg[red])%}
%(?..%B(%?%)%b)%{$fg[default]%}$ '
export RPROMPT="[%{$fg[yellow]%}%*%{$fg[default]%} on %{$fg[yellow
]%}%D%{$fg[default]%}]"
(Remove the line breaks)

Basically, if you put this code in your .zshrc you get a prompt like this (although with better colours):

~/Anthony/Work$ cd imperial/261\ Lab\ 2                                  [18:31:58 on 10-01-30]
~/A/W/imperial/261 Lab 2$ perl -e 'exit 63'                              [18:32:09 on 10-01-30]
~/A/W/imperial/261 Lab 2(63)$                                            [18:33:10 on 10-01-30]

On the left is the current working directory, the exit code of the last command (if not 0), and a $
On the right is the time and date.

The path on the left side abbreviates all but the last two path elements to a single letter (and a lot of esoteric code was required to get this working).

Note: to do this, you need a few extra lines in your .zshrc to set things up:

autoload colors zsh/terminfo
colors
setopt prompt_subst

Have fun editing your prompt line!

autoload colors zsh/terminfo
colors
setopt autoload colors zsh/terminfo
colors
setopt prompt_subst

Desert Bus

I’ve been watching Desert Bus for Hope for the last day or so. It’s a charity marathon to raise money for Child’s Play, which gives games to children’s hospitals. But the main thing is that it’s run by the great sketch comedians at LoadingReadyRun.com. Basically they’re playing the most boring game in existence – Desert Bus – for as long as the donations keep coming in.

So far they’ve raised over $25000, and they beat what they raised in their first year within 24 hours of starting. Because of the amount they’ve raised so far, they’ll be going for at least 110 hours altogether. You* can prolong their suffering by visiting desertbus.org and donating.

Now I’m going to go and get some sleep before getting up and watching more of their live feeds. Maybe I’ll even get around to finishing some of my coursework before it’s due in on Monday.

* Chances are you don’t exist, as I haven’t really told anyone the address of this blog yet.

Update: They finished after 5 and 2/3 days, and raised over $137,000

The first post

Well this is the second time I’ve installed and configured this blog. I forgot to copy the MySQL database across when I switched my laptop-server from kubuntu to arch linux. There shouldn’t be any reason for it to disappear again though.

I think I’ve got everything working again now, and it should be easy enough to rewrite the one real blog post I had on here originally. I’ve even got the about page done already. Now, however, it is time for sleep; followed by lectures in 8 hours time.