Lab 8: Writing a SimpleC Compiler
The tests that you will use to test your compiler are in tests.tar.
Copy tests.tar into your lab8-src directory and type "tar -xvf
tests.tar".
To run the tests cd into "tests" and type "testall".
Goal
In this lab you will write a compiler for the Simple C language.Your
compiler will generate x86-64 assembly language that can be assembled
to produce an executable file.
Introduction
You will write a compiler for the SimpleC language. The SimpleC
language is a subset of the C language that only has the following
types:
SimpleC Type
|
long
|
long*
|
char*
|
char**
|
void
|
Examples of the SimpleC language is the following:
------------ test1.c ---------------
void main()
{
printf("Hello world...\n");
}
------------ test2.c ---------------
long fact(long n) {
if (n==0) return 1;
return n*fact(n-1);
}
void main()
{
printf("Factorial of 5 = %d\n" , fact(5));
}
--------------- test3.c -------------
long sum(long n, long* a) {
long i, s;
s = 0;
for (i=0; i<n; i=i+1) {
s = s + a[i];
}
return s;
}
long main()
{
long* a;
a = malloc(5*8);
a[0]=4; a[1]=3; a[2]=1; a[3]=7; a[4]= 6;
long s;
s = sum(5, a);
printf("sum=%d\n", s);
}
---------------------------------
Language Description
The language is very similar to "C" but only uses long, long*, char,
char* and void as types. NOTICE: There is no space betwen "long" or
"char" and "*" .
Arithmetic Operators: +, -, *, /, %,
Logical Operators: &&, ||
Equality Operators: ==, !=
Relational Operators: <, <=, >, >=
Flow control: if, if.else, while, for, do
Downloading the Starting Sources
You can dwnload the sources from lab8-src.tar.Z.
Uncompress them and
untar them.
uncompress lab8-src.tar.Z
tar -xvf lab8-src.tar
Now you are ready to compile your first program. Type:
cd lab8-src
make
./scc test1.c
cat test1.c
cat test1.s
The commans scc compiles the
program and generates the assembly file test1.s
Currently the scc compiler has just the bare bones of the compiler you
will write. The main implementation of the compiler is in simple.y.
The file simple.y is a "yacc" file that describes the grammar of the
language and it has some "actions" or "C" code that are executed when
some parts of the grammar are reached.
For example, the entry in siple.y:
function:
var_type WORD
{
fprintf(fasm, "\t.text\n");
fprintf(fasm, ".globl %s\n", $2);
fprintf(fasm, "%s:\n", $2);
}
LPARENT
arguments RPARENT compound_statement
{
fprintf(fasm, "\tret\n", $2);
}
;
This entry is reached when a function definition is parsed. var_type is the type of the
function: long, char*, void etc. WORD is the name of the function and
it can be accessed using the variable $2, that means the 2nd element in
the defintion. The code between {...} is executed after the function
name is parsed and before the "(". The code generates the preamble of
the function that includes .text,
.global and the name:.
After the whole function is parsed, the epiloge with the ret instruction is generated.
For more information about yacc see Yacc and
Lex tutorial. and Lex
and Yacc manuals.
Part 1 Generating code for expressions
The first step will consist in generating code for expressions. The
expressions contain all the operands Arithmetic Operators: +, -, *, /,
%,
Logical Operators: &&, || Equality Operators: ==, != Relational
Operators: <, <=, >, >=.
Since the code generation is intended to be done in a single pass, it
will be easier to generate code that simulates a stack machine.
However, instead of using memory all the time for the stack, we will be
using some of the registers. In this way, registers and not memory will
be used in most of the cases. This registers will be saved when
entering a function and restored before leaving a function. When more
than 5 entries in the stack are needed then the stack can be exteneded
to the execution stack.
Stack Position
|
Register/Memory
|
0
|
rbx
|
1
|
r10
|
2
|
r13
|
3
|
r14
|
4
|
r15
|
>=5
|
Use the execution stack
|
Example:
x = y
+ 5*(z-1)
Using a stack machine:
push y
push 5
push z
push 1
-
*
+
Using the equivalent register positions:
mov y,
%rbx # push y
mov $5, % r10 # push 5
mov z,
%r13 # push
z
mov
$1,%r14 # push 1
subq %r14, %r13 # -
imulq %r13, %r10 #*
addq %r10,%rbx
# +
Add the code in simple.y that will generate the assembly code described
above for all the epressions, Arithmetic Operators: +, -, *, /, %,
Logical Operators: &&, || Equality Operators: ==, != Relational
Operators: <, <=, >, >=, variables and integer constants.
After completing this part you should be able to assemble and run
test2.c
Turnin
Follow these instructions to turnin lab8 part 1:
- Make sure that your programs are built by typing "make". Make
sure it builds and runs in one of the machines sslab01.cs,
sslab02.cs etc.
- Type "turnin -c cs250 -p lab8-1 lab8-src"
- Type "turnin -c cs250 -p lab8-1 -v" to make sure you have
submitted the right files
The deadline for this part is Monday November 18th 2013 at 11:59pm.
Part 2 Generating code for control flow statements
Inn this part you will generate the code for if-else/while/do-while/for
etc. Alsoyou will complete the code for array derreference. After you
finish this part of the project you should be able to generate assemble
and run test3.c and test4.c. Also write your own tests.
Turnin
Follow these instructions to turnin lab8 part 2:
- Make sure that your programs are built by typing "make". Make
sure it builds and runs in one of the machines sslab01.cs,
sslab02.cs etc.
- Type "turnin -c cs250 -p lab8-2 lab8-src"
- Type "turnin -c cs250 -p lab8-2 -v" to make sure you have
submitted the right files
The deadline for this part is Monday December 2nd at 11:59pm.