You will write a small compiler that will read an input program and represent

Be Prepared For The Toughest Questions

Practice Problems

Abstract

The goal of this project is to give you some hands-on experience with implementing a small compiler. You will write a compiler for a simple language. You will not be generating assembly code. Instead, you will generate an intermediate representation (a data structure that represents the program). The execution of the program will be done after compilation by interpreting the generated intermediate representation.

1 Introduction

You will write a small compiler that will read an input program and represent in a linked list as a sequence of instruction. A node of the linked list represents one instruction. A instruction node specifies: (1) the type of the instructions, (2) the operand(s) of the instruction (if any) and, for jump instructions, the next instruction to be executed (the default is that the next instruction in the list is executed). After the list of instructions is generated by your compiler, your compiler will execute the generated list of instructions by interpreting it. This means that the program will traverse the data structure and at every node it visits, it will “execute” the node by changing the content of memory locations corresponding to operands and deciding what is the next instruction to execute (program counter). The output of your compiler is the output that the input program should produce. These steps are illustrated in the following figure

The remainder of this document is organized into the following sections:

1. Grammar Defines the programming language syntax including grammar.

2. Execution Semantics Describe statement semantics for assignment, input, if, while, switch, for and output statements.

3. How to generate the linked list of instructions Explains step by step how to generate the intermediate representation (data structure).

4. Requirements Lists other requirements.

2 Grammar

The grammar for this project is the following:

program ! var section body inputs

var section ! id list SEMICOLON

id list ! ID COMMA id list j ID

body ! LBRACE stmt list RBRACE

stmt list ! stmt stmt list j stmt

stmt ! assign stmt j while stmt j if stmt j switch stmt j for stmt

stmt ! output stmt j input stmt

assign stmt ! ID EQUAL primary SEMICOLON

assign stmt ! ID EQUAL expr SEMICOLON

expr ! primary op primary

primary ! ID j NUM

op ! PLUS j MINUS j MULT j DIV

output stmt ! output ID SEMICOLON

input stmt ! input ID SEMICOLON

while stmt ! WHILE condition body

if stmt ! IF condition body

condition ! primary relop primary

relop ! GREATER j LESS j NOTEQUAL

switch stmt ! SWITCH ID LBRACE case list RBRACE

switch stmt ! SWITCH ID LBRACE case list default case RBRACE

for stmt ! FOR LPAREN assign stmt condition SEMICOLON assign stmt RPAREN body

case list ! case case list j case

case ! CASE NUM COLON body

default case ! DEFAULT COLON body

inputs ! num list

num list ! NUM

num list ! NUM num list

Some highlights of the grammar:

1. Division is integer division and the result of the division of two integers is an integer.

2. Note that if stmt does not have else.

3. Note that for has a very general syntax similar to that of the for loop in the C language

4. Note that the input and output keywords are lowercase, but other keywords are all upper- case.

5. condition has no parentheses.

6. There is no type specified for variables. All variables are int by default.

3 Variables and Locations

The var section contains a list of all variable names that can be used by the program. For each variable name, we associate a unique locations that will hold the value of the variable. This association between a variable name and its location is assumed to be implemented with a function location that takes a variable name (string) as input and returns an integer value. The locations where variables will be stored is called mem which is an array of integers. Each variable in the program should have a unique entry (index) in the mem array. This association between variable names and locations can be implemented with a location table.

4 Inputs

The list of input values is called inputs and appears as the last section of an input program. This list must be read by your compiler and stored in an inputs array, which is simply a vector of integers.

5 Execution Semantics

All statements in a statement list are executed sequentially according to the order in which they appear. Exception is made for body of if stmt, while stmt and switch stmt as explained below. In what follows, I will assume that all values of variables as well as constants are stored in locations. This assumption is used by the execution procedure that we provide. This is not a restrictive assumption. For variables, you will have locations associated with them. For constants, you can reserve a location in which you store the constant (this is like having an unnamed immutable variable).

5.0.1 Input statements

Input statements get their input from the sequence of inputs. We refer to i’th value that appears in inputs as i’th input. The execution of the i’th input statement in the program of the form input a is equivalent to:

mem[location("a")] = inputs[input_index] input_index = input_index + 1

where location("a") is an integer index value that is calculated at compile time as we have seen above. Note that the execution of an input statement advances an input index which keeps track (at runtime) of the next value to read (like in project 1).

5.1 Output statement

The statement

output a;

prints the value of variable a at the time of the execution of the output statement.

5.2 Assignment Statement

To execute an assignment statement, the expression on the righthand side of the equal sign is evaluated and the result is stored in the location associated with the lefthand side of the expression.

5.3 Expression

To evaluate an expression, the values in the locations associated with the two operands are obtained and the expression operator is applied to these values resulting in a value for the expression.

5.2 Assignment Statement

To execute an assignment statement, the expression on the righthand side of the equal sign is evaluated and the result is stored in the location associated with the lefthand side of the expression.

Details

Hint

ComputerProgramming is the process of creating a set of instructions that tell a computer how to perform a task. The purpose of programming is to find a sequence of instructions that will automate the performance of a task (which can be as complex as an operating system) on a computer, often for solving a given problem....

Select Deadline for Completion

4 Days

3 Days

2 Days

1 Day

1 to 15 Hours

Know the process

Students succeed in their courses by connecting and communicating with
an expert until they receive help on their questions

Unable to find what you’re looking for?

Consult our trusted tutors.

Ask a Question

Be Prepared For The Toughest Questions

Practice Problems

Know the process

Submit Question

Tutor Is Assigned

Receive Help

Unable to find what you’re looking for?