Both assignments are mandatory. The first is on parsing, and the second on classes, objects, and inheritance.
The first assignment is on tools for lexical analysis and parsing. The Asterix compiler from which we start, is implemented using the Flex (Lex-like) lexical analyzer generator and the Bison (YACC-like) LALR parser generator. Once they were very innovative tools, but nowadays there are much more modern parser generating tools, which are easier to use, faster, and much better at error correction.
In this assignment the Bison implementation has to be replaced by an LLnextgen implementation. LLnextgen is a rewrite of LLgen, hence the name, and provides better development support (error reporting) than the original LLgen implementation, which is a modern LL(1) parser generator, with static and dynamic facilities to resolve non-LL(1) ambiguities. The behavior of the new compiler must be the same as the old compiler for correct input. For incorrect input, the compiler with the LLnextgen parser must have better support for error reporting and error correction.
The fact that the original parser generator is LALR and the new one is LL(1) has a number of implications:
The sources for the Asterix compiler can be found in /usr/local/asterix-compiler/src. You should create your own source directory and copy all files in /usr/local/asterix-compiler/src to that directory.
Tips:
[Fix the bugs, if any, discovered by your instructor during the grading of assignment 1.]
In this assignment, you are to extend the Asterix language with objects and classes, inheritance, and dynamic binding. A complete description of the changes of the language specification is given below. You must modify your LLnextgen-based compiler such that it implements the extended language.
declaration |
class_declaration |
In addition to var declarations and subprogram declarations, we introduce class declarations.
class_declaration |
"class" identifier![]() ![]() |
This rule declares a new class with the name identifier
in the global scope.
It is an error to declare a class other than at the global level.
identifier
may not be re-declared in the global scope.
Optionally the class may inherit from one superclass
identifier.
identifier
must be declared as a class before
identifier
.
Each class introduces a new scope for its member declarations. These can be either var declarations or subprogram declarations (methods). If the class does not inherit from another class, the immediate parent scope of the member scope is the global scope. If the class does inherit from another class, the immediate parent scope of the class is the member scope of the class from which it inherits (see Fig. 1).
A method in a superclass can be overridden by a method in a subclass with the same name. Method overriding is only allowed if the types of the arguments are exactly the same, including the way arguments are passed (by value or by reference). Furthermore, both methods must either be a function returning the same type, or both be a procedure. Member variables cannot be overridden, however they can be redeclared in a subclass as a different variable.
type |
identifier |
A new class type is introduced. identifier must be declared by a preceding, or the current, class declaration. A variable of this type is a reference to an object of (a subclass of) class identifier.
There is a distinction between a static type and a dynamic type. The static type of a variable is the type used in its declaration. The dynamic type of a variable is the type used in the "new" expression.
There is also a distinction between assignment compatibility and being of exactly the same types. Two expressions are assignment compatible if the static type of the lvalue is (a superclass (recursively)) of the static type of the rvalue. Assignment compatibility is used for the assignment statement, call-by-value parameter passing, and function value returning. For call-by-reference parameter passing, the types must be exactly the same (as well as for arrays of objects.)
expression |
"new" identifier |
This expression creates an object of static and dynamic type identifier. identifier must be declared by a class declaration. If there is no heap space for a new object, a runtime error occurs.
statement |
"delete" expression ";" |
The semantics of a delete-statement has been redefined as deleting a class object or an array object. The reference must have been obtained by a new expression, although deleting a "null" value is allowed. Runtime behavior is undefined if an object is deleted more than once.
expression |
"self" |
"self" is a reference to the object currently being invoked. It may not be used outside a method. Its static type is a reference to the class that contains the method. Its dynamic type is determined at object creation time. It is not an lvalue.
expression |
"null" |
The meaning of the "null" expression has been redefined. "null" denotes a null reference to either an array of any type or any class. It is not an lvalue.
expression |
expression![]() |
qualident |
[ identifier![]() ![]() |
The dot-operator is used to select a member of an object.
The precedence level and associativity are the same as for the index operator
"[]" and function call "()".
expression must be of any class type.
The program behavior is undefined if expression
equals
"null" or has been deleted.
If identifier is specified, it must be the same class as
the static type of expression
, or one of its superclasses
(recursively).
The qualification operator "::" binds stronger than the dot-operator.
Name resolution for identifier
starts in
identifier
, or in the static type of
expression
if identifier
is not specified.
If identifier
is not declared in that class, and the class
has a superclass, identifier
is searched for in its
superclass (recursively).
It is a compile-time error if identifier
is not declared
in any of its (super)classes.
Binding is dynamic if identifier denotes a method and
is not qualified with identifier
.
If identifier
denotes a member variable, or is qualified
with identifier
, binding is static.
The difference between static and dynamic binding has consequences for
the method being invoked.
If B is a subclass of A and overrides method m,
object obj has static type A and dynamic type B,
then if obj.m() is specified, the method defined in class B is
invoked;
if obj.A::m() is specified, the method defined in class A is
invoked.
For convenience the following short-hand notation may be used within the context of a method:
expression |
qualident |
it is semantically equivalent to "self""."qualident.
An example program is shown below.
(* Test function overriding *) include io.inc class A is function f() : int is return 1; end class B inherits A is function f() : int is return 2; end class C inherits B is function f() : int is return 3; end function main(argv : array of string) : int is var obj : B; begin obj := new C; WriteString("Expect 3 : "); WriteInt(obj.f()); (* Note: obj.C::f() is not allowed *) WriteLine(); WriteString("Expect 2 : "); WriteInt(obj.B::f()); WriteLine(); WriteString("Expect 1 : "); WriteInt(obj.A::f()); WriteLine(); delete obj; return 0; end
K.G. Langendoen 2006-12-12