Compilers take high-level code (like C) and turn it into machine code. MIPS was designed to be easy for compilers to generate code for.
You can learn the same rules that compilers use to turn any high-level pseudocode into MIPS just by following some rules. They’re BRAIN ALGORITHMS.
Contents:
Variables
Assuming you have some global variables like this:
.data
x: .word 10
y: .word 20
z: .byte 30
.text
Description | C | MIPS | Notes |
---|---|---|---|
Get value |
|
|
The type of load instruction must match |
Set value |
|
|
The type of store instruction must match |
Read-modify-write |
|
|
Combine the above two. |
Copy variable |
|
|
Same idea. |
Control flow
while(true)
and do-while
The easiest kinds of loops.
Description | C | MIPS | Notes |
---|---|---|---|
Infinite loop |
|
|
To break, go to a label after |
|
|
|
Like an infinite loop, but |
if
and while
with conditions
These two are similar, but you have to flip the conditions.
Description | C | MIPS | Notes |
---|---|---|---|
|
|
|
Invert the condition. |
|
|
|
Invert the condition. |
if-else
and if-else if-else if...
Don’t forget that you have to jump over the else
.
Description | C | MIPS | Notes |
---|---|---|---|
|
|
|
Invert the condition. |
|
|
|
Invert all conditions. |
for
loops
These examples use t
registers as the loop counters, but in many cases s
registers are more appropriate. See this section of the cookbook to choose which kind.
Description | C | MIPS | Notes |
---|---|---|---|
“simple” |
|
|
Loops 1 or more times, unlike |
“Java-like” |
|
|
Invert the condition. |
switch-case
Think of them like a highway with exits. You pass the exits you don’t care about until you get to the one you do.
Description | C | MIPS | Notes |
---|---|---|---|
|
|
|
♫♩ |
Conditions
Remember that in HLLs, &&
and ||
use short-circuit evaluation:
- in
a && b
, ifa
is false, thenb
is never evaluated - in
a || b
, ifa
is true, thenb
is never evaluated
Description | C | MIPS | Notes |
---|---|---|---|
|
|
|
Invert both conditions. |
|
|
|
Invert the last condition. |
Another way of thinking of &&
and ||
&&
is really like a nested if.
// same as (x >= 10 && x < 20)
if(x >= 10) {
if(x < 20) {
println("in range");
}
}
||
is like… well, if you’re using ==
it’s like a switch-case
. It’s more complicated otherwise, but I think this is enough to show the idea:
// same as (x == 10 || x == 11)
switch(x) {
case 10:
case 11:
println("one of em")
break;
}
// NOT VALID C/JAVA but okay in some other languages:
switch {
case x < 3:
case x > 10:
println("outside range [3..10]");
break;
}
1D Arrays
Arrays are just variables sitting next to each other. They’re spooning.
Making an array: int array[] = {1, 2, 3, 4, 5};
.data
array: .word 1, 2, 3, 4, 5
.text
Getting a value: array[i]
Assume i
is represented by s0
. This is the short way that uses the special form of lw
to skip the la
and add
:
mul t0, s0, 4 # t0 = Si: multiply index by size of *one item in the array*
lw t1, array(t0) # t1 = array[i]
This is the long way that does the explicit address calculation:
la t0, array # t0 = A: get base address
mul t1, s0, 4 # t1 = Si: multiply index by size of *one item in the array*
add t0, t0, t1 # t0 = A + Si
lw t1, (t0) # t1 = array[i]
If the array is an array of bytes, you can skip the mul
step - since you’d be multiplying the index by 1. That means in the short form, accessing an array of bytes is just one line!
Setting a value: array[i] = 10
Assume i
is represented by s0
.
This is the short way that uses the special form of lw
to skip the la
and add
:
li t1, 10 # t1 = 10
mul t0, s0, 4 # t0 = Si: multiply index by size of *one item in the array*
sw t1, array(t0) # array[i] = 10
This is the long way that does the explicit address calculation:
# exactly the same address calculation as above!
la t0, array # t0 = A: get base address
mul t1, s0, 4 # t1 = Si: multiply index by size of one item in the array
add t0, t0, t1 # t0 = A + Si
# just this differs.
li t1, 10 # t1 = 10
sw t1, (t0) # array[i] = 10
2D Arrays
2D arrays are just 1D arrays sitting next to each other. They’re spooning arrays, full of spooning variables. Meta-spooning.
These are “row-major” arrays:
- the items in each row row are stored sequentially in memory.
- to get from one column to the next, you add the size of one item.
- to get from one row to the next, you add the number of bytes in a row…
- …which is the width times the size of one item.
Making a 2D array: int array[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };
.data
array: .word 1, 2, 3, 4, 5, 6, 7, 8, 9
# or, if you like, you can put newlines
array: .word
1, 2, 3
4, 5, 6
7, 8, 9
.text
Calculating the address of an item: a[row][col]
It’s a good idea to make constants for the dimensions and item sizes.
# width and height of the array
.eqv ARRAY_W 3
.eqv ARRAY_H 3
# size of 1 row = width * size of 1 item
# items are 4 bytes each, so 3 * 4
.eqv ARRAY_ROW_SIZE 12
To translate array[row][col]
, assuming row
is s0
and col
is s1
:
la t0, array # t0 = A: get base address
mul t1, s0, ARRAY_ROW_SIZE # t1 = Rr: multiply row by size of *one row*
mul t2, s1, 4 # t2 = Bc: multiply col by size of *one item*
add t0, t0, t1
add t0, t0, t2 # t0 = A + Rr + Bc
# now load/store at address (t0)
Function calls
For writing functions, see the cookbook instead.
There are two (or three) steps:
- Put the arguments in the argument registers, starting with
a0
- Call the function
- If it returns something, the return value is in
v0
.
Calling a syscall: print_int(10)
li a0, 10
li v0, 1 # number of print_int syscall
syscall
Calling a syscall that returns something: x = read_int()
# no arguments!
li v0, 5 # number of read_int syscall
syscall
sw v0, x # return value is in v0
Calling a regular function: addPoints(50)
li a0, 50
jal addPoints
# if it returned something, it would be in v0.
# because YOU put it there. right? ;D
Structs
A struct is a group of variables next to each other in memory, but unlike an array, they can be different types and sizes.
Let’s say we have this C struct:
typedef struct {
bool active;
bool angry;
int x;
int y;
int health;
} Enemy;
Here is a small C program which will show the size of the entire struct, and the offsets of each variable from the beginning of the struct. Click the “run” button to see the output.
Unfortunately the MARS assembler has no direct support for structs, so we have to do them manually.
Following the C compiler’s lead, we can declare some constants for the field offsets and struct size:
.eqv Enemy_active 0
.eqv Enemy_angry 1
.eqv Enemy_x 4
.eqv Enemy_y 8
.eqv Enemy_health 12
.eqv Enemy_sizeof 16
If we want one copy of the struct, we can declare a variable of it like this:
.data
.align 2 # IMPORTANT!!!!!!!!!!!!!!!!!!!!!!!!! ensures alignment for the 'word' variables
one_enemy: .space Enemy_sizeof
If we want an array of the struct, we have to calculate the size ourselves, as the assembler has no support for arithmetic expressions (most do):
.data
.eqv NUM_ENEMIES 10 # length of the array
.align 2 # IMPORTANT!!!!!!!!!!!!!!!!!!!!!!!!!
array_of_enemies: .space 160 # == NUM_ENEMIES * Enemy_sizeof
# alternatively you could write ".space Enemy_sizeof" 10 times
# but that's....... well........................... weird
In either case, we access the struct like this:
- get a pointer to the beginning of the struct
- access its fields using the
offset(reg)
form of loads and stores
For example:
# t0 = &one_enemy (in C syntax)
la t0, one_enemy
# one_enemy.active = true
li t1, 1
sb t1, Enemy_active(t0) # adds field offset to base address
# one_enemy.angry = false
sb zero, Enemy_angry(t0)
So whenever we have a struct pointer in a register, we write Struct_field(reg)
to load/store its fields.
for
loops work like for any other type, but remember that each item is going to be Enemy_sizeof
bytes apart. For this reason, a “walking pointer” loop is often a better fit.
li s0, 0 # i = 0
la s1, array_of_enemies # start the pointer at the beginning of the array
_loop:
# array_of_enemies[i].x++
lw t0, Enemy_x(s1)
add t0, t0, 1
sw t0, Enemy_x(s1)
add s1, s1, Enemy_sizeof # walk the pointer forward by the size of one enemy
add s0, s0, 1 # i++
blt s0, NUM_ENEMIES, _loop