LALR(1) Grammar

This language specification provides an LALR(1) grammar for all of its syntax. The notation used is deliberately basic: each non-terminal has a single rule, which contains a |-separated list of the productions that reduce to that non-terminal. For example:

QName = Name | QName . Name

This says that a QName (qualified name) can either be a single name or a qualified name followed by a dot and another name.

Rules can also contain epsilon productions, which contain no tokens at all. These are represented by giving an epsilon symbol (ε) as one of the productions.

The language’s terminals consist of keywords, symbols, and literals.

In this specification, keywords begin with lower-case letters, and literals and non-terminals begin with upper-case letters. Symbols are usually shown as the list of characters that represents them; however, the = and | symbols are represented by Equals and Pipe respectively.
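The rule notation described above is simple enough to read mechanically. The following is a minimal Python sketch, not part of the specification. It assumes that every symbol in a rule is separated by whitespace, so that a standalone = or | is always notation and never a language symbol; `parse_rule` and the `Opt` rule in the usage note are hypothetical names used only for illustration.

```python
def parse_rule(line):
    """Split a rule such as 'QName = Name | QName . Name' into its parts.

    Returns (non_terminal, productions), where each production is a list
    of symbols. An epsilon production is returned as an empty list.
    """
    tokens = line.split()
    if len(tokens) < 2 or tokens[1] != "=":
        raise ValueError("expected a rule of the form: NonTerminal = ...")
    non_terminal = tokens[0]
    productions = [[]]
    for token in tokens[2:]:
        if token == "|":
            # A standalone pipe starts the next alternative.
            productions.append([])
        elif token != "ε":
            # Epsilon contributes no symbols to its production.
            productions[-1].append(token)
    return non_terminal, productions
```

For example, `parse_rule("QName = Name | QName . Name")` yields the non-terminal `QName` with the two productions `["Name"]` and `["QName", ".", "Name"]`, and `parse_rule("Opt = ε | Name")` yields an empty production followed by `["Name"]`.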

Literals

The full list of literals is:

  • Name
  • StringLiteral
  • IntegerLiteral
  • FloatingLiteral
  • SinceSpecifier

Each of the grammar’s terminals is parsed by the tokeniser. Keywords are single words in the list of keywords below, and are case-sensitive. Names are any words not in the list of keywords.

Names

Names are sequences of upper- and lower-case letters, underscores, and decimal digits. A name can begin with a letter or an underscore, but not with a digit.

If a sequence of characters matches a keyword exactly, it is considered a keyword instead of a name.
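The name and keyword rules above can be sketched as a small classifier. This Python sketch is illustrative only: it assumes that "letters" means the ASCII letters A-Z and a-z, `KEYWORDS` holds only a few entries from the full keyword list, and `classify_word` is a hypothetical helper name.

```python
import re

# A letter or underscore, followed by any number of letters, digits, or underscores.
# Assumption: only ASCII letters count as letters.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# A small subset of the keyword list, for illustration only.
KEYWORDS = frozenset({"abstract", "class", "if", "since", "while"})


def classify_word(word):
    """Return 'keyword', 'name', or None if the word is neither."""
    if NAME_PATTERN.fullmatch(word) is None:
        return None
    # Keywords are case-sensitive, so only an exact match counts.
    return "keyword" if word in KEYWORDS else "name"
```

Under these assumptions `classify_word("class")` is a keyword, `classify_word("Class")` and `classify_word("_x1")` are names, and `classify_word("1x")` is neither, because it begins with a digit.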

String Literals

A string literal consists of a double-quote character, a sequence of characters and escape sequences, and finally another double-quote character. The characters cannot include double-quote characters or line feeds, and each escape sequence begins with a backslash.

An escape sequence can be one of the following:

\\ a backslash
\b a backspace
\t a tab
\n a line feed
\f a form feed
\r a carriage return
\" a double quote
\' a single quote

Alternatively, it can be a multi-character sequence which forms a character code in either octal or hexadecimal:

  • An octal character code begins with a backslash and then consists of one to three octal digits.
  • A hexadecimal character code begins with a backslash followed by a u and then consists of exactly four hexadecimal digits.
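The escape sequences listed above can be decoded with a single pass over the body of a string literal (the text between the two double quotes). The following Python sketch is one possible reading, not the reference tokeniser. It assumes that an octal code takes as many digits as are available, up to three, with no limit on the resulting value, and that any escape not listed above is an error; `decode_escapes` is a hypothetical name.

```python
import re

# Single-character escapes from the table above.
SIMPLE_ESCAPES = {
    "\\": "\\", "b": "\b", "t": "\t", "n": "\n",
    "f": "\f", "r": "\r", '"': '"', "'": "'",
}

# A backslash followed by: u and four hex digits, one to three octal digits,
# or any single character (possibly none, at the end of the input).
ESCAPE_PATTERN = re.compile(r"\\(u[0-9A-Fa-f]{4}|[0-7]{1,3}|.?)", re.DOTALL)


def decode_escapes(body):
    """Replace every escape sequence in a string literal body."""
    def replace(match):
        escape = match.group(1)
        if len(escape) == 5 and escape[0] == "u":
            return chr(int(escape[1:], 16))   # hexadecimal character code
        if escape and escape[0] in "01234567":
            return chr(int(escape, 8))        # octal character code
        if escape in SIMPLE_ESCAPES:
            return SIMPLE_ESCAPES[escape]
        raise ValueError("invalid escape sequence: \\" + escape)
    return ESCAPE_PATTERN.sub(replace, body)
```

With this sketch, both `decode_escapes(r"\u0041")` and `decode_escapes(r"\101")` produce the single character A, while an unlisted escape such as `\q` raises a `ValueError`.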

Integer Literals

Floating Literals

Since Specifiers

Keywords

The full list of keywords is:

abstract
boolean
break
byte
cast
catch
class
compound
constructor
continue
double
else
extends
false
final
finally
float
for
getter
if
immutable
implements
import
instanceof
int
interface
long
mutable
native
new
null
object
package
property
return
selfish
setter
short
since
static
super
this
throw
throws
true
try
ubyte
uint
ulong
unbacked
unchecked
ushort
void
while

Symbols

The full list of symbols is:

& ampersand
&= ampersand equals
-> arrow
^ caret
^= caret equals
: colon
, comma
&& double ampersand
:: double colon
== double equals
<< double langle
<<= double langle equals
-- double minus
%% double percent
%%= double percent equals
|| double pipe
++ double plus
>> double rangle
>>= double rangle equals
. dot
= equals
! exclamation mark
!= exclamation mark equals
/ forward slash
/= forward slash equals
# hash
< langle
<= langle equals
{ lbrace
( lparen
[ lsquare
- minus
-= minus equals
% percent
%= percent equals
| pipe
|= pipe equals
+ plus
+= plus equals
? question mark
?: question mark colon
?. question mark dot
> rangle
>= rangle equals
} rbrace
) rparen
] rsquare
; semicolon
* star
*= star equals
~ tilde
_ underscore
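Several of these symbols are prefixes of others (for example <, << and <<=). This specification does not state here how the tokeniser chooses between them; a common approach, assumed in the Python sketch below, is to take the longest symbol that matches at the current position. `SYMBOLS` is only a subset of the list above and `match_symbol` is a hypothetical name.

```python
# A subset of the symbol list above, chosen to include shared prefixes.
SYMBOLS = [
    "<", "<<", "<<=", "<=", "=", "==",
    ">", ">>", ">>=", ">=",
    "?", "?:", "?.", ":", "::", ".",
]

# Try longer symbols first so that "<<=" wins over "<<" and "<".
SYMBOLS_BY_LENGTH = sorted(SYMBOLS, key=len, reverse=True)


def match_symbol(text, position=0):
    """Return the longest symbol starting at position, or None."""
    for symbol in SYMBOLS_BY_LENGTH:
        if text.startswith(symbol, position):
            return symbol
    return None
```

Under this assumption `match_symbol("<<= 1")` returns the three-character symbol <<= rather than < or <<, and `match_symbol("a?.b", 1)` returns ?. rather than ?.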
