Syntax

Care should be taken, when using colons and semicolons in the same sentence, that the reader understands how far the force of each sign carries.
—Robert Graves and Alan Hodge

Keywords

break   else      if       override   struct   var
class   extends   loop     proc       type     while
const   for       method   return     until

Reserved identifiers

bool    int      nil      false   new    true
Null    Refany   Root     Text
First   Last     Number

Operators

+        -         <        >        {        }        =
*        /         <=       >=       (        )        ==
||       &&        .        ..       [        ]        !=
^        %         ,        !        :        ;        :=

Comments

A comment is an arbitrary character sequence opened by /* and closed by */. Comments can be nested and can extend over more than one line.
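Because comments nest, a scanner cannot simply search for the first */ after an opening /*. The following is a minimal sketch in Python, not taken from any reference implementation, of one way to find the end of a nested comment with a depth counter. The function name skip_comment is hypothetical.

```python
def skip_comment(src: str, pos: int) -> int:
    """Return the index just past the comment that starts at src[pos].

    A depth counter tracks nesting: each "/*" opens a level and each
    "*/" closes one. The comment ends when the depth returns to zero.
    """
    if not src.startswith("/*", pos):
        raise ValueError("no comment starts at this position")
    depth = 0
    i = pos
    while i < len(src):
        if src.startswith("/*", i):
            depth += 1
            i += 2
        elif src.startswith("*/", i):
            depth -= 1
            i += 2
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated comment")


source = "/* outer /* inner */ still outer */ x := 1;"
rest = source[skip_comment(source, 0):]  # text after the whole comment
```

A flat search for the first */ would stop after "inner" and treat "still outer */" as program text; the counter avoids that.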

Conventions for syntax

We use the following notation for defining syntax:

    X Y      X followed by Y
    X|Y      X or Y
    [X]      X or empty
    {X}      A possibly empty sequence of X's
    X&Y      X or Y or X Y

"Followed by" has greater binding power than | or &; parentheses are used to override this precedence rule. Non-terminals begin with an upper-case letter. Terminals are either keywords or quoted operators. The symbols Id, Number, TextLiteral, and CharLiteral are defined in the token grammar. Each production is terminated by a period.

Compilation unit productions

Compilation = {Decl} [ Block ].

Block = "{" { Decl } { Stmt } "}".
Decl = const ConstDecl ";"
     | type TypeDecl ";"
     | var VariableDecl ";"
     | proc Id Signature ( Block | ";" ).

ConstDecl      = Id [":" Type] "=" ConstExpr.
TypeDecl       = Id "=" Type.
VariableDecl   = IdList (":" Type & ":=" Expr).

Signature      = "(" Formals ")" [":" Type].
Formals        = [ Formal {";" Formal} [";"] ].
Formal         = [var] IdList ":" Type.

Statement productions

Stmt = AssignSt | Block | BreakSt | CallSt | ForSt | IfSt | LoopSt | ReturnSt.

AssignSt = Expr ":=" Expr ";".
CallSt   = Expr "(" [Actual {"," Actual}] ")" ";".
BreakSt  = break ";".
ForSt    = for Id ":=" Expr ".." Expr Block.
IfSt     = if Expr Block [ else ( IfSt | Block ) ].
LoopSt   = loop [ while Expr ] Block [ until Expr ";" ].
ReturnSt = return [Expr] ";".

Actual   = Type | Expr .

Type productions

Type = TypeName | ArrayType | RecordType | RefType | ObjectType.

ArrayType     = "[" [ Expr ] "]" Type.
RecordType    = struct "{" Fields "}".
RefType       = "^" Type.
ObjectType    = class [ extends Type ] "{" Members "}".

Fields    = [ Field {";" Field} [ ";" ] ].
Field     = IdList ":" Type.
Members   = [ Member {";" Member} [ ";" ] ].
Member    = Field | Method | Override.
Method    = Id Signature [":=" ConstExpr].
Override  = Id ":=" ConstExpr .

Expression productions

ConstExpr = Expr.

Expr = E1 {"||" E1}.
  E1 = E2 {"&&" E2}.
  E2 = {"!"} E3.
  E3 = E4 {Relop E4}.
  E4 = E5 {Addop E5}.
  E5 = E6 {Mulop E6}.
  E6 = {"+" | "-"} E7.
  E7 = E8 {Selector}.
  E8 = Id | Number | CharLiteral | TextLiteral | "(" Expr ")".

Relop =  "==" | "!=" | "<"  | "<=" | ">" | ">=".
Addop =  "+" | "-".
Mulop =  "*" | "/" | "%".

Selector = "^" | "." Id  |  "[" Expr "]" | "(" [ Actual {"," Actual} ] ")".
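The levels E1 through E8 encode operator precedence directly: each level parses operands of the next level down. The sketch below is a small recursive-descent parser in Python that follows those productions for a subset of the grammar; it is an illustration under stated assumptions, not the language's actual parser. It omits Selector, CharLiteral, and TextLiteral, recognizes only decimal numbers, and the names tokenize, Parser, and parse are hypothetical. Trees are returned as nested tuples.

```python
import re

# Two-character operators are listed before one-character ones.
TOKEN = re.compile(
    r"\s*(\|\||&&|==|!=|<=|>=|[-+*/%<>!()]|[A-Za-z][A-Za-z0-9_]*|[0-9]+)"
)
RELOPS = {"==", "!=", "<", "<=", ">", ">="}


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def binary(self, operand, ops):
        # Implements "X {op X}" as a left-associative chain.
        left = operand()
        while self.peek() in ops:
            op = self.take()
            left = (op, left, operand())
        return left

    def expr(self):  # Expr = E1 {"||" E1}.
        return self.binary(self.e1, {"||"})

    def e1(self):  # E1 = E2 {"&&" E2}.
        return self.binary(self.e2, {"&&"})

    def e2(self):  # E2 = {"!"} E3.
        if self.peek() == "!":
            self.take()
            return ("!", self.e2())
        return self.e3()

    def e3(self):  # E3 = E4 {Relop E4}.
        return self.binary(self.e4, RELOPS)

    def e4(self):  # E4 = E5 {Addop E5}.
        return self.binary(self.e5, {"+", "-"})

    def e5(self):  # E5 = E6 {Mulop E6}.
        return self.binary(self.e6, {"*", "/", "%"})

    def e6(self):  # E6 = {"+" | "-"} E7.  (E7's selectors are omitted.)
        if self.peek() in ("+", "-"):
            op = self.take()
            return (op, self.e6())
        return self.e8()

    def e8(self):  # E8 = Id | Number | "(" Expr ")".  (literals omitted)
        tok = self.take()
        if tok == "(":
            inner = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return inner
        if tok[0].isalnum():
            return tok
        raise SyntaxError(f"unexpected token {tok!r}")


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError("trailing input")
    return tree
```

One consequence of the level ordering is that "!" sits above the relational operators, so parse("!a == b") groups as "!" applied to (a == b), while unary minus sits below Mulop, so parse("-a * b") negates only a.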

Miscellaneous productions

IdList      =  Id {"," Id}.
TypeName    =  Id.

Token productions

To read a token, first skip all blanks, tabs, newlines, carriage returns, vertical tabs, form feeds, comments, and pragmas. Then read the longest sequence of characters that forms an operator, an Id, or a Literal.
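The longest-match rule matters for operators that share a prefix, such as ":" and ":=", or "." and "..". As a rough sketch in Python, assuming the operator list given earlier and handling only whitespace skipping (comments and pragmas are left out), an operator reader can try the two-character candidates before the one-character ones. The names OPERATORS, BLANKS, and read_operator are hypothetical.

```python
OPERATORS = {
    "+", "-", "<", ">", "{", "}", "=",
    "*", "/", "<=", ">=", "(", ")", "==",
    "||", "&&", ".", "..", "[", "]", "!=",
    "^", "%", ",", "!", ":", ";", ":=",
}

# Blank, tab, newline, carriage return, vertical tab, form feed.
BLANKS = " \t\n\r\v\f"


def read_operator(src, pos):
    """Skip whitespace, then return (operator, next_pos).

    Returns (None, pos_after_whitespace) if no operator starts there.
    No operator in the list is longer than two characters, so trying
    length 2 before length 1 yields the longest match.
    """
    while pos < len(src) and src[pos] in BLANKS:
        pos += 1
    for length in (2, 1):
        candidate = src[pos:pos + length]
        if candidate in OPERATORS:
            return candidate, pos + len(candidate)
    return None, pos
```

With this ordering, the text "x := 1" yields ":=" rather than ":" followed by "=".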

An Id is a case-significant sequence of letters, digits, and underscores that begins with a letter. An Id is a keyword if it appears in the list of keywords, a reserved identifier if it appears in the list of reserved identifiers, and an ordinary identifier otherwise.
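The classification can be sketched as two set lookups over the lists given earlier. This Python fragment is illustrative only; the names KEYWORDS, RESERVED, and classify are hypothetical. Because Ids are case-significant, the lookups compare spellings exactly.

```python
import re

KEYWORDS = {
    "break", "else", "if", "override", "struct", "var",
    "class", "extends", "loop", "proc", "type", "while",
    "const", "for", "method", "return", "until",
}

RESERVED = {
    "bool", "int", "nil", "false", "new", "true",
    "Null", "Refany", "Root", "Text",
    "First", "Last", "Number",
}

# A letter followed by letters, digits, and underscores.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def classify(text):
    """Return 'keyword', 'reserved', 'ordinary', or None if not an Id."""
    if ID_PATTERN.fullmatch(text) is None:
        return None
    if text in KEYWORDS:
        return "keyword"
    if text in RESERVED:
        return "reserved"
    return "ordinary"
```

For example, classify("Null") gives "reserved" while classify("null") gives "ordinary", and "_tmp" is not an Id at all because it does not begin with a letter.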

In the following grammar, terminals are characters surrounded by double-quotes and the terminal "\"" represents double-quote itself.

Id = Letter {Letter | Digit | "_"}.

Literal = Number | CharLiteral | TextLiteral.

CharLiteral = "'"  (PrintingChar | Escape | "\"") "'".

TextLiteral = "\"" {PrintingChar | Escape | "'"} "\"".

Escape = "\" "a"   | "\" "b"   | "\" "f"   | "\" "n"   | "\" "r"
       | "\" "t"   | "\" "v"   | "\" "\"   | "\" "'"   | "\" "\""
       | "\" ( "0" | "1" | "2" | "3" ) OctalDigit OctalDigit
       | "\x" HexDigit HexDigit
       | "\u" HexDigit HexDigit HexDigit HexDigit
       | "\U" HexDigit HexDigit HexDigit HexDigit
              HexDigit HexDigit HexDigit HexDigit.

Number = Digit {Digit}
       | Digit {Digit} "_" HexDigit {HexDigit}.
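The Number production accepts either a plain run of decimal digits or a run of decimal digits, an underscore, and a run of hexadecimal digits. The following Python sketch checks only that shape; the production does not say what the two parts of the second form mean, so no value is computed. The names NUMBER and is_number are hypothetical.

```python
import re

# Digit {Digit} optionally followed by "_" HexDigit {HexDigit}.
NUMBER = re.compile(r"[0-9]+(_[0-9A-Fa-f]+)?")


def is_number(text):
    """Return True if the whole of text matches the Number production."""
    return NUMBER.fullmatch(text) is not None
```

Under this check "42" and "16_FF" are accepted, while "16_" (nothing after the underscore), "_FF" (no leading digit), and "16_G" (G is not a HexDigit) are rejected.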

PrintingChar = Letter | Digit | OtherChar.

Digit = "0" | "1" | ... | "9".

OctalDigit = "0" | "1" | ... | "7".

HexDigit = Digit | "A" | "B" | "C" | "D" | "E" | "F"
                 | "a" | "b" | "c" | "d" | "e" | "f".

Letter = "A"  | "B"  | ... | "Z"  | "a"  | "b"  | ... | "z".

OtherChar = " " | "!" | "#" | "$" | "%" | "&" | "(" | ")"
          | "*" | "+" | "," | "-" | "." | "/" | ":" | ";"
          | "<" | "=" | ">" | "?" | "@" | "[" | "]" | "^"
          | "_" | "`" | "{" | "|" | "}" | "~".