In this project, you will be filling in the important parts of the code generation pass of the Truss compiler.

You do not have to have completed projects 1, 2, or 3 successfully to do this project. But if you do, once you finish this project, you can combine them all and have a fully-functional compiler, which is pretty cool!

Quick start guide:

  1. You should have received a GitHub invitation link in your email. Click it!
    • Now you should have access to your repo.
  2. git clone it to your computer.
  3. Edit the codegen pass source as detailed below so that all the test programs compile and run.
  4. As you work, make use of git commits to keep track of your progress.

    You can also use git push origin main at any time to ensure that there’s a safe copy of your project on the GitHub servers, or to transfer your code between computers (by pushing it from one and pulling it on the other).

  5. When you’re done (or when you’re not done but it’s due):
    1. Update the README.txt to include your name and username and any information you think will be useful to the grader.
    2. Also update this line in Cargo.toml:
      • authors = ["YOUR NAME <YOUR PITT EMAIL>"]
    3. Commit those changes, and…
    4. git push origin main to submit it.
      • the last commit before the late due date is what we will grade.

Submission and Grading

Project submissions will only be accepted through GitHub. If you are having trouble using git, please get in touch sooner rather than at 11:58 PM on the due date.

The project is due at midnight on the due date, but there is some wiggle room there. (We can see the time that you pushed to GitHub.)

You can turn it in for late credit until midnight on the day after the due date. You’ll get 10% off your grade.

You can absolutely get partial credit for each category, depending on how much code you’ve written, how serious the mistake is, etc.

Grading rubric (based on which programs compile and run correctly):

These programs do not count towards or against your grade, they are just for testing:

A different workflow

This project will work a little differently than the first three. Instead of cargo test, you will be running a program that outputs MIPS assembly, and then optionally runs the assembly program using MARS. So, it’s going to be up to you to verify that your output is “correct” by seeing if the resulting programs run correctly.

Why no cargo tests? For codegen, there are many ways in which your output might subtly differ from mine, but still be correct. Choices of registers, label names, extraneous instructions etc. make it impossible to have one “correct” output against which yours is judged. Also, something something halting problem.

What you’ve been given

Although you are not required to have completed the previous projects, there are some test programs that you can make use of by copying these files over from those projects into this one (but do not copy any other files as that may cause things to break):

Most of the source code is familiar to you by now, but:

Getting up to speed

After doing the typechecking in project 3, the compiler now has three data structures:

The last pass is codegen. This is defined in src/cg/mips/, called DirectCg. It takes these three data structures, and uses them to produce the MIPS assembly version of the program.

Internally, DirectCg uses CgCtx as self.ctx. CgCtx bundles NameCtx and TypeCtx together and provides some convenience methods for common operations that would be awkward to do on the bare NameCtx and TypeCtx objects.

A lot of code is already written in, but it’s mostly support code:

The test programs

The test programs are located in programs/. The .trs files are the actual Truss source code. (You can open them in your editor and use Rust syntax highlighting to read them.)

But because I can’t assume you’ve done projects 1 through 3, I’ve also had to include a second file for each program. the .ron files are the frozen versions of each program. These files were produced with src/bin/

Each .ron file is the Program, NameCtx, and TypeCtx data structures that my compiler produced, saved (“frozen”) to a file. If you open a .ron file in your code editor, you’ll see it’s just some text.

When you compile a .ron file, it “unfreezes” those data structures and passes them to DirectCg::run. That way, your codegen can be tested without your lexing/parsing/typechecking working at all.

Do not modify any of the .trs or .ron files while working on the project. If you do accidentally modify them, you can use git to undo any changes. Ask for help with that if you don’t know how.

Getting Started

When you run the compiler on the command line, be sure you are in the repository’s root directory. That is, when you ls (or dir) you should see programs, src, etc. Otherwise, the file paths will be wrong and it won’t work.

Open up programs/01_empty.trs in your editor. It just contains:

fn main() {}

Not very exciting, but now you can try compiling and running the .ron version:

cargo run --bin compile -- -r programs/01_empty.ron

After it compiles the compiler, it should show some output that looks like:

    Finished dev [unoptimized + debuginfo] target(s) in 0.47s
     Running `target/debug/compile -r programs/01_empty.ron`
------------ running MARS, all following output is from program ------------

Wow, it’s nothing! But no news is good news. It’s an empty program, so it should do nothing.

If MARS does not run successfully, please contact me ASAP. You literally cannot do this project without it.

Last, you can look at the output of the compiler. You should now see programs/01_empty.asm has appeared. If you open it in your editor, you’ll see the generated code for main at the top:

    # Stack Frame:
    #   8 bytes total, 8 of which are variables/saved registers
    sw   fp, -4(sp)
    sw   ra, -8(sp)
    move fp, sp
    addi sp, sp, -8
    # ------------------------------------------------
    # ------------------------------------------------
    lw   ra, -8(fp)
    lw   fp, -4(fp)
    addi sp, sp, 8
    li   v0, 10

Right after main: are some comments describing the stack frame (arguments and locals) and the function prologue. The two # ------ lines surround the function’s code (which is empty here of course). And then there is _return: and the function epilogue.

(Below main is the standard/runtime library source code. This is present in every program the compiler outputs, so you can ignore it.)

When working on a test, you can keep its .asm file open in your editor, and your editor will reload it when it changes. That way you can diagnose problems with your implementation by looking at the output assembly after running the compiler.

Your task

You will be implementing the visit_stmt and visit_exp_into_reg methods towards the bottom of These do the real work of generating the code inside the body of a function.

If you look at those methods, most of the cases in their matches look something like:

let _ = (decl, );
unimplemented!("let statement");

You will delete both of these lines of code whenever you implement a case. The let _ = line is just to make the compiler shut up about unused variables, and the unimplemented!() crashes the program when executed, telling you what’s unimplemented. For example, try this:

cargo run --bin compile -- -r programs/02_hello_int.ron

You should get an error like:

thread 'main' panicked at 'not implemented: exp statement', src/cg/mips/

If we look at the source for 02_hello_int.trs, we see:

fn main() {
    // if this were a calculator, it'd look like hello...

We have some expression statements inside main, which is why our codegen crashed: we haven’t implemented those yet.

So, let’s get started with that!

Expression statements (02_hello_int)

In, have a look at how visit_stmt is declared:

fn visit_stmt(&mut self, s: &Box<Stmt>, b: &mut CgFuncBuilder)

b is important: you call methods on it to generate instructions, and you will pass it to the various visit_exp_* methods as well. You can think of it as the “notepad” that we are writing down all the instructions onto.

Now, let’s implement the Exp(exp) => case of visit_stmt.

  1. As I said before, remove the let _ = and unimplemented! lines of that case.
  2. Replace them with this:
     self.visit_exp_no_result(exp, b);

Try compiling again. It still crashes, but in a New and Exciting way, complaining about “direct calls”!

thread 'main' panicked at 'not implemented: direct calls', src/cg/mips/

Implementing direct calls

If you go to the very bottom of the file, you’ll see a method called direct_call. This is called by visit_exp_into_reg when it encounters a “direct” function call (that is, when you call a function using its name, instead of through a function pointer variable). Its arguments are:

Fortunately this will be easy to implement:

  1. use the self.visit_args helper method to visit the args.
    • look it up in this same file.
    • The first argument is an Option for this which you can pass None for.
  2. use b.jal(name); to generate a jal instruction to name.
    • it’s that easy!

Still no luck, but we’re almost there:

thread 'main' panicked at 'not implemented: int literal expression', src/cg/mips/

Implementing int literals

As the panic message implied, the last thing we need to implement is the int literal expressions.

In visit_exp_into_reg’s IntLit(val) => case:

  1. once again delete the let _ = and unimplemented!
  2. and replace them with:
     b.load_immediate(dst, *val as isize);

b.load_immediate generates an li MIPS instruction. It takes the destination register (which is one of visit_exp_into_reg’s arguments) and the value to be loaded. We have to cast it to isize for it to work.

Now if you compile, the program should run!

------------ running MARS, all following output is from program ------------

Wooahhhhh!!!! It’s actually running the code!

You can verify that this is a real MIPS program by looking at 02_hello_int.asm. The body of main (without the prologue/epilogue) now looks like:

    # ------------------------------------------------
    addi sp, sp, -4
    li   s0, 0
    sw   s0, 0(sp)
    jal  print_i
    addi sp, sp, -4
    li   s0, 7734
    sw   s0, 0(sp)
    jal  println_i
    # ------------------------------------------------

b.load_immediate generated the li instructions, and b.jal generated the jals (duh). The addis and sws were generated by visit_args.

String literals (03_hello)

Now that we got expression statements and direct calls done, the next few things will be easy. 03_hello.trs is a “hello, world” program. But compiling 03_hello.ron panics saying “string literal expression” is unimplemented. Ok, let’s do that.

Strings can’t fit into registers of course. So, there are two steps to implementing the StrLit case in visit_exp_into_reg:

  1. use to add the string literal’s data to the .data segment.
    • this returns the label that is generated for the string literal. put it in a variable.
  2. use b.load_address(dst, &label); to generate a la instruction.
    • label is the variable from step 1.

Now 03_hello.ron should compile and run successfully. Check out 03_hello.asm. At the top:

STRLIT$0: .asciiz "hello, world!"

There’s the string literal. And in main:

    addi sp, sp, -4
    la   s0, STRLIT$0
    sw   s0, 0(sp)
    jal  println_s

There’s the la that puts the address of the string literal into the register.

Sidenote: using the cg_interactive program

This is optional. You don’t have to use cg_interactive. It’s just another way to test.

If you copied your lexer, parser, and typechecking from your other projects into this one, you can use the cg_interactive program like this:

cargo run --bin cg_interactive

Like the other _interactive programs, it lets you type in some code and immediately shows you the output. This time, the output is MIPS.

It starts in “expression mode” so you can type in an expression and it will evaluate that expression into register s0. You can switch modes with @e, @s, and @d (the message it prints when it starts explains this).

Try entering a single int literal:

exp > 10


    li   s0, 10


So it lexed, parsed, name-checked, type-checked, and then code-generated “10”, which ended up as an instruction li s0, 10.

The .data segment is there for globals and string constants. If you enter a string constant, you can see it there:

exp > "hello"

STRLIT$0: .asciiz "hello"

    la   s0, STRLIT$0


Using cg_interactive might be an easier way to test things if your code has bugs and you’re trying to narrow down where the problem is. It’s also just fun/enlightening. I think so anyway.

let statements and identifier expressions (04_let)

Using variables is kind of important. 04_let.ron panics about let statement being unimplemented because the code looks like:

fn main() {
    let msg = "this is a variable.";

For the Let case in visit_stmt, just use this:

self.visit_local_var_decl(&var, b);

I know, not too exciting, but check out how visit_local_var_decl is implemented. It:

With that implemented, now we have a panic about identifier expression, so let’s implement that.

In the Id(ident) case in visit_exp_into_reg:

  1. use self.get_sym_var_loc (which wraps b.get_sym_var_loc with some extra logic) to get the location of the ident.
  2. use self.loc_into_reg to perform a load.
    • load the location you got from step 1 into the dst register.

You can see that these two steps are similar to the last two steps of visit_local_var_decl, except we’re loading instead of storing, and may be loading from a global instead of a local.

With that implemented, 04_let.ron should now work and display:

------------ running MARS, all following output is from program ------------
this is a variable.

04_let.asm’s main should look something like:

    # Stack Frame:
    #   16 bytes total, 16 of which are variables/saved registers
    #   -12(fp): msg


	la   s0, STRLIT$0
	sw   s0, -12(fp)
	addi sp, sp, -4
	lw   s0, -12(fp)
	sw   s0, 0(sp)
	jal  println_s

The stack frame comments tell you that -12(fp) refers to the msg variable, and sure enough the code stores a value into it, then loads it when calling println_s.

Assignment into variables (05_assign)

Now let’s do the opposite of identifier expressions: assignment into an identifier expression.

In visit_stmt’s Assign case, there is a sub-match with the two kinds of assignment LHSes: identifiers and fields. We’ll just do identifiers now.

In the ExpKind::Id(ident) sub-case, the LHS of the assignment is ident and the RHS is src. Do this:

  1. get the RHS’s value into a register by using self.visit_exp on src.
  2. get the location of ident with self.get_sym_var_loc again.
  3. use self.reg_into_loc to store the register from step 1 into the location from step 2.

Notice that 05_assign.trs does assignments into both a local and a global, and this code should work for both.

Done correctly, the output should be:

------------ running MARS, all following output is from program ------------
message 1
message 2
message 3

Check out 05_assign.asm - you’ll see some of the loads/stores are accessing -12(fp) for the msg variable, and others are accessing the global g.

Return statements (06_return)

This program has three functions, and one of them even returns a value. So let’s implement return statements.

Recall that there are really two parts to a return: first you put any return value in the v0 register; and second you jump to the end of the function.

In the Return(val) case in visit_stmt, val is an Option: Some(v) if there’s a return value, and None if not. You can match on it, but in this case an if let is simpler:

if let Some(v) = val {
    // there is a return value!

Inside that:

  1. Use self.visit_exp on v to get the register it was evaluated into;
  2. Use b.move_(Reg::V0, reg); to copy the value from that register into v0. (move is a Rust keyword so the method had to be named move_ instead, wah.)

Finally, after that if statement, use b.jump("_return");. This generates a j that jumps to the _return: label at the beginning of the function epilogue.

Done correctly, the output should be:

------------ running MARS, all following output is from program ------------
test_no_val running
test_val running

And as always, inspect the 06_return.asm to see what was generated!

Boolean literals and Unary expressions (07_unary)

Let’s knock out two really easy ones. Have a look at 07_unary.trs - it uses both unary operators, negation (-) and logical not, as well as a boolean literal.

For the BoolLit(val) case in visit_exp_into_reg, use the exact same code as you did for IntLit. Easy!

For the Unary case, we now have the first instance of an expression with another expression inside it. In order to avoid an unnecessary explosion of temporary registers, we will evaluate the lhs into the dst register. So:

  1. Use self.visit_exp_into_reg on lhs and pass dst as the second argument.
  2. Use b.inst_from_unop(*op, dst, dst); to generate the right instruction.

Check out how inst_from_unop works - it generates either a neg or seq instruction. (seq sets its destination to 1 if the two inputs are equal, and comparing a bool equal to 0 will effectively give the logical NOT. We don’t want to use a not instruction because that would give us -1, instead of 1!)

The output of 07_unary.ron should be:

x = 5
x = -5
b = true
b = false

“Normal” binary operators (08_binary and 09_concat)

You can see that visit_exp_into_reg has three cases for Binary, and two of them are special cases for BinOp::And and BinOp::Or. Leave those alone for now, and let’s work on the Binary { op, lhs, rhs } case first.

Now we have two sub-expressions, lhs and rhs. Our strategy will be to use the existing register for lhs, and use a new register for rhs. So our asm output will be something like:

	# evaluating 5 + 3 into s0:
	li  s0, 5
	li  s1, 3
	add s0, s0, s1

To do this we will:

  1. Use self.visit_exp_into_reg on lhs into dst, just like for unary operators.
  2. Use self.visit_exp for the rhs, to put it into a new register.
  3. Use b.inst_from_binop(*op, dst, dst, rhs_reg); to generate the instruction.

The output from 08_binary.ron should be:

5 + 3 = 8
5 - 3 = 2
5 * 3 = 15
5 / 3 = 1
5 % 3 = 2
5 < 3 = false
5 > 3 = true
5 <= 3 = false
5 >= 3 = true
5 == 3 = false
5 != 3 = true

So far, so good, right? Well now look at 09_concat.trs:

fn main() {
    let hello = "hello";
    let world = "world!";
    println_s(hello + ", " + world);

    print_s("are the strings equal? ");
    println_b((hello + ", " + world) == "hello, world!");

This uses string concatenation and string equality, two operations which will require us to use some runtime support functions. If you try to run 09_concat.ron right now, the program crashes when MARS runs it!:

------------ running MARS, all following output is from program ------------
Error in 09_concat.asm line 102: Runtime exception at 0x00400148: address out of range 0x30030013

Processing terminated due to errors.

Why’s that? Well, if you look at 09_concat.asm it becomes clear when you look at the code right before jal println_s:

    lw   s0, -12(fp)
    la   s1, STRLIT$2
    add  s0, s0, s1
    lw   s1, -16(fp)
    add  s0, s0, s1
    sw   s0, 0(sp)

It’s adding the addresses of the strings, which is meaningless.

Further on, the code for the println_b((hello + ", " + world) == "hello, world!"); line is just a total mess: it adds three string addresses, then uses seq to compare the bogus address to another string’s address. This is all messed up. In Truss, string equality works like .equals() in Java.

To fix this, we’ll add two special cases in the Binary case: one for string concatenation, and one for string equality:

if *op == BinOp::Add && self.ctx.node_type( {
    // special case for string concatenation!
} else if *op == BinOp::Eq && self.ctx.node_type( {
    // special case for string equality!
} else {
    // the other code that you just wrote (self.visit_exp_into_reg etc.)

The special cases are easy: we’ll generate calls to the rt$concat and rt$streq functions by using these:

self.call_rt_concat(dst, lhs, rhs, b);


self.call_rt_streq(dst, lhs, rhs, b);

With that done, 09_concat.ron should now work and display:

hello, world!
are the strings equal? true


Indirect function calls (10_indirect)

An indirect call is one made using a function pointer variable, rather than calling the function directly by name. They are a little trickier because of the register allocation behavior.

Have a look at the source for 10_indirect.trs.

fn main() {

fn takes_a_function(f: fn(int): ()) {

fn returns_a_function(): fn(int): () {
    return println_i;

The takes_a_function function… takes a function as its argument. Since both println_i and println_c are of type fn(int): (), they can be passed as arguments to it.

So takes_a_function(println_i) will print 65, and takes_a_function(println_c) will print uppercase A (println_c prints a character).

The last line of main is the most interesting: it calls returns_a_function(), then immediately calls the function that it returned. Here, that’s println_i, so returns_a_function()(42) should print 42.

This kind of function call is handled by the indirect_call method at the end of

We have to be careful about how we generate code here, because the evaluation rules say we must evaluate the callee before the args, but if we’re not careful, evaluating the args might overwrite any temporary values that the callee created!

SO, this is what we have to do in indirect_call:

  1. visit the callee and get its register, but this time using self.visit_exp_no_free.
    • this allocates a register but does not free it.
    • that means YOU are responsible for freeing the register when you’re done using it.
    • but it also means that the value is “safe” in the register that it returned: if you evaluate anything else, that register will remain unmodified.
  2. use self.visit_args just like you did for direct_call.
  3. use b.jalr(callee_reg) using the register you got in step 1.
    • jalr is like jal, but the target is in a register instead of a function’s name.
  4. important: use b.pop_reg() to deallocate the register that was allocated by step 1.

When done, the output should be:


More interesting is the assembly. For example, takes_a_function’s body looks something like:

    lw   s0, 0(fp)
    addi sp, sp, -4
    li   s1, 65
    sw   s1, 0(sp)
    jalr s0

Notice how s0 is loaded with the argument (0(fp)), and then it starts evaluating the arguments, using s1 instead of s0. s0 was allocated to hold the callee, meaning the arguments were forced to use s1 for their evaluation. Neat!

Struct new and field access (11_fields)

Let’s get started with structs. Have a look at 11_fields.trs and see what it does. The first line of main does:

    let s = new S();

And right now, the compiler is complaining about “new struct expression”. So, let’s implement it in visit_exp_into_reg’s New(..) case.

Remember from class that allocating something on the heap is another place where we have to use a runtime function. That function is rt$new and it takes the size of the structure to allocate in bytes. So here’s how we’ll do it:

  1. use self.get_struct_sym_id(&e) to get the symbol ID of the structure that this new uses.
  2. use the self.get_struct_layout method to get the layout for the structure.
    • look up its signature. You will pass the ID that you got in step 1.
  3. use the .size() method on that structure layout to get its size in bytes.
  4. use self.call_rt_new(dst, size, b); to generate the call to rt$new.

Now we get a new panic about “assignment into field”… but we’ll deal with that in a bit.


First let’s do getting a field. In visit_exp_into_reg’s Field case, you have the obj (the thing to the left of the dot) and the name (the name to the right of the dot). Here’s how it works:

  1. visit the obj expression into the dst register, just like you did with lhs in Unary.
  2. use self.get_field_var_loc to get a VarLoc representing the field.
    • it takes the object expression (obj), the name, and then the object register (dst, here).
  3. use self.loc_into_reg on that location, just like you did for Id.

Fields are just another kind of variable, so loc_into_reg works on them too!

But we’re not done yet. We also have to implement setting a field. That’s the other sub-case in visit_stmt’s Assign, the ExpKind::Field case. This will look a bit different though.

If the code looked like this:

	f().x = g();

The evaluation rules say that f must be called before g. So we have a very similar situation to the indirect calls, where we have to evaluate the object, then keep it around in a register while we evaluate the RHS of the assignment. visit_exp_no_free to the rescue!

So for field assignments:

  1. visit the obj with visit_exp_no_free to put it into a safe register.
  2. visit the src with visit_exp to put it in any old register.
  3. self.get_field_var_loc to get the field location, like before.
    • Make sure you pass the register you got from step 1, not step 2!
  4. self.reg_into_loc like the other kind of assignment.
  5. b.pop_reg() to free the register from step 1.

Now 11_fields should work and output this:

s.x = 10
s.y = 20
s.b = false

Particularly interesting in the asm is this sequence corresponding to s.b = s.x == s.y (comments added for explanation):

    lw   s0, -12(fp) # s0 = s on the LHS of the assignment
    lw   s1, -12(fp) # s1 = s in s.x
    lw   s1, 0(s1)   # s1 = s.x
    lw   s2, -12(fp) # s2 = s in s.y
    lw   s2, 8(s2)   # s2 = s.y
    seq  s1, s1, s2  # s1 = s.x == s.y
    sb   s1, 4(s0)   # s.b = s.x == s.y (see how it uses 'sb' for bools)

Method calls (12_point)

12_point.trs is starting to look like a real program! You’ve implemented almost everything needed for it to work except method calls. Fortunately, I realized that doing dynamic method dispatch with vtables was a bit too ambitious for this language, so this will actually be really easy. ;o

What we are doing instead is static dispatch. When we see a method call like p.print(), we will look up p’s struct (Point here), and call the print method declared inside that struct. It’s that simple.

At the bottom of is the method for the third and final kind of call, method_call. Its arguments are:

Just three steps here:

  1. use self.get_struct_method_name to get the assembly name of the method.
    • E.g. the print method in Point is named Point$print: in the asm.
  2. use self.visit_args like before, but this time use Some(obj) as the first argument.
  3. use b.jal(&method_name); on the name you got in step 1.

That’s it! The program will output:

p = Point(5, 10)
now p = Point(25, 10)
manhattan length of Point(25, 10) = 35
squared length of Point(25, 10) = 725

Check out how the Point methods are named in the output asm as well.

while loops (13_while)

These last few things are control flow. That means you could, if you make a mistake, make a program that gets stuck in an infinite loop. While testing my project, I found that usually ctrl+C would kill MARS when it looped infinitely, but sometimes it would not, and I had to kill MARS from my Task Manager (or Activity Monitor). Be careful!

Control flow is definitely the most complex part of codegen, so let’s start with the simplest kind, a while loop. But before we get into it, we have to talk about labels.

Labels, Jumps, and Branches

The builder has a method, b.make_label_name which will generate a unique name for a label. For example, b.make_label_name("while_cond") might return a string like _while_cond_3. The number is arbitrary.

This does not place the label into the code, though. To do that, you use b.label(&name). That will actually write the label with a colon after it in the code. But be careful: if you use b.label(&whatever) more than once for the same label, MARS will complain about duplicate label names like:

Error in ../programs/whatever.asm line 221 column 1: label "main$$$_while_cond_4" already defined

So: call b.label once for each label that was made with b.make_label_name.

Finally, the build has three methods for making jumps and branches which take label names as arguments:

Because all conditions in our language are bool, and bools are represented as 1 or 0, that’s all the branches we need!

Back on track: making while loops work

In visit_stmt’s While case, you have cond (the condition) and code (the code inside) to work with. What you want to generate is some asm that looks like:

    # <the code for `cond`, generated into some s reg, say s0>
    beq  s0, zero, _while_end_2 # if the condition is 0 (false), exit the loop
    # <the code for `code`>
    j    _while_cond_1          # continue the loop

So here’s what you’ll do:

  1. make two label names using b.make_label_name("while_cond") and b.make_label_name("while_end").
  2. place the while_cond label with b.label.
  3. visit cond with visit_exp and get its register.
  4. branch to the while_end label if that register is equal to 0 (use b.beqz).
    • note: I don’t mean doing if reg == Reg::Zero, I mean beqz compares the register to 0.
  5. visit the code statement.
  6. jump to the while_cond label (b.jump).
  7. place the while_end label with b.label.

Whew. Lots of steps, but if you compare your code to what is output, it should hopefully make sense which part maps to which. Done correctly, this program will output:

i = 0
i = 1
i = 2
i = 3
i = 4
i = 5
i = 6
i = 7
i = 8
i = 9

If it doesn’t loop (just prints out Done!), maybe you used b.bnez instead of b.beqz, or maybe you used the wrong register as input to b.beqz. You should use the register returned from self.visit_exp on the cond.

If it loops infinitely, uh, I’m not sure what might be wrong! 💦

if statements (14_if)

14_if.trs tests out if statements. less_than_10 has an if with no else; fact has an if-else; and three_way has an if-else if-else.

if statements aren’t really that much more complicated than while statements, it’s just that there are two kinds: those with an else and those without an else, so the codegen is a bit bigger.

Remember that else_ is an Option so you will have to match it like match else_.

Rather than walking you through it, I will give you the two patterns for the two flavors of if and see if you can work out how to do it using what you learned from while loops. (And good coding practice means doing common tasks outside the match else_…)

An if without an else looks like this:

    # <the code for `cond`>
    beq  s0, zero, _if_end_4
    # <the code for `then`>

An if with an else looks like this:

    # <the code for `cond`>
    beq  s0, zero, _if_else_3
    # <the code for `then`>
    j    _if_end_4
	# <the code for `else_`>

Done correctly, the program will output:

yes, 3 < 10.
fact(5) = 120 (should be 120)
x is small.
x is medium.
x is large.

A small tangent (15_linked_list)

If you try to run 15_linked_list, it will complain about null being unimplemented. null is really easy: in visit_exp_into_reg’s Null case, load an immediate 0 into the dst reg, similarly to the IntLit case.

Now you should now be able to run 15_linked_list. Have a look at its .trs source, predict what it will do, then run it to confirm. It should output:

1 -> 2 -> 3
removing second node
1 -> 3

Wow, you can do Real Stuff in this language! We’ve got data structures!!

Logical and and or (16_and_or)

Finally we get to finish visit_exp_into_reg! The two special cases for Binary are the last things to fill in.

The Binary { op: BinOp::And case is for logical and, and it works like this:

  1. use visit_exp_into_reg on the lhs, and give it dst, like Unary.
    • since the lhs is guaranteed to give a bool, we can evaluate it directly into our register.
  2. if the dst register is 0, branch past step 3.
    • so, you’ll have to make a label…
  3. use visit_exp_into_reg on the rhs and give it dst.

Control flow inside an operator! What kind of madness is this!

Logical or works almost exactly the same, except the condition is inverted. So use b.bnez instead.

16_and_or is a silly program and it outputs this when done correctly:

test(false, false):
  neither x nor y!
test(true, false):
  x or y
test(false, true):
  x or y
test(true, true):
  x and y
  x or y
compound(false, false, false):
compound(false, true, false):
compound(true, true, false):
  you got the right combination!
compound(false, false, true):
  you got the right combination!

for loops (17_for)

Time for the final part. for loops are the most complex control flow statement. When you write

for i in 0, 10 {


  1. evaluates the lower bound (here, 0) and initializes i with it
  2. evaluates the upper bound (here, 10) and stores it in a safe register
  3. works like a while i < upper loop
  4. also increments i at the bottom of the loop

So there are several things going on!

Step 2 points out something important: the upper bound is only evaluated once and its value is kept in a register for the duration of the loop. So in this code:

fn main() {
    for i in 0, upper() {}

fn upper(): int {
    println_s("upper was called!");
    return 10;

It will only print upper was called! once at the start of the loop, not 10 times.

So let’s do it!

  1. use self.visit_local_var_decl to visit the var member.
    • But this time, save its return value in a variable!
    • This is the VarLoc that represents the loop counter variable, and you’ll be using it a few times.
  2. use self.visit_exp_no_free to visit hi and save that register in a variable.
    • you may as well put b.pop_reg() at the end of this case now, so you don’t forget it later.
  3. like with while loops, make labels for the condition and end.
  4. place the condition label down with b.label.
  5. now we want to do the equivalent of while var < hi, but we have to implement it manually:
    • use let reg = b.cur_reg(); to get a free register that you can use.
    • use self.loc_into_reg(reg, var_loc.clone(), b); to load the counter variable into reg.
    • then b.set_less_than(reg, reg, hi_reg); to compare reg to the value in hi_reg
    • and finally, if reg is 0, branch to the end label.
  6. visit the code.
  7. now we need to increment the counter, but again we have to implement it manually:
    • use let reg = b.cur_reg(); to get a free register again.
    • do the self.loc_into_reg thing again to load the counter variable.
    • use to generate an instruction that increments reg by 1.
    • finally, use self.reg_into_loc(var_loc, reg, b); to store the value back into the variable.
  8. jump to the condition…
  9. …and place the end label.
  10. oh yeah, and if you forgot to do it before, put b.pop_reg() here.

Check out 17_for.trs. When you run 17_for.ron, it should print:

upper was called!
i = 0
i = 1
i = 2
i = 3
i = 4
i = 5
i = 6
i = 7
i = 8
i = 9

There are many points of failure here, so I can’t predict them all.

Victory lap (18_ast, 19_mystery)

With that, you’re done. You can run these two programs to just make sure everything is working.

For fun: completing the compiler

You do not need to do this! This is just for fun after you’ve completed the project. Also, if you have problems with this, please hold off asking questions until after the project due date.

First, copy over the appropriate files (and only the listed files) from your previous projects as detailed in this earlier section.

Then, if all your passes work, you should now be able to compile and run the programs directly from the source code! All you have to do is pass the source code (whatever.trs) to compile instead of the ron file:

$ cargo run --bin compile -- -r programs/03_hello.trs
    Finished dev [unoptimized + debuginfo] target(s) in 0.46s
     Running `target/debug/compile -r programs/03_hello.trs`
------------ running MARS, all following output is from program ------------
hello, world!

Then you can change the test programs or even write your own! Isn’t it cool? idk I think it’s cool

If you get errors from the lexer about invalid characters (like (1:12) invalid character) when you try to do this: there’s a bug in the lexer which interacts with the way git handles line endings on Windows that didn’t manifest until now.

To fix this: go into src/lex/ and find the eat_whitespace_and_comments method. Inside the match, the first case checks for whitespace. Change it to this:

' ' | '\t' | '\n' | '\r' =>

The missing '\r' is the bug.