Compilers take high-level code (like C) and turn it into machine code. MIPS was designed to be easy for compilers to generate code for.

You can turn any high-level pseudocode into MIPS by hand, just by following the same kinds of rules that compilers use. They’re BRAIN ALGORITHMS.


Variables

Assuming you have some global variables like this:

.data
    x: .word 10
    y: .word 20
    z: .byte 30
.text
Get value: x and z

    lw t0, x
    lb t1, z # or lbu

The type of load instruction must match the variable type. For half/byte, you decide whether they’re signed.

Set value: x = 30 and z = 10

    li t0, 30
    sw t0, x
    li t0, 10
    sb t0, z

The type of store instruction must match the variable type.

Read-modify-write: x++

    lw  t0, x
    add t0, t0, 1
    sw  t0, x

Combine the above two.

Copy variable: x = y

    lw t0, y
    sw t0, x

Same idea.
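The "you decide whether they’re signed" part is the difference between lb and lbu: lb sign-extends the byte to 32 bits, lbu zero-extends it. Here is a rough C sketch of what each load leaves in the register. The function names lb_result and lbu_result are made up for illustration; they are not part of any library.

```c
#include <stdint.h>

/* What lb leaves in the register: the byte sign-extended to 32 bits. */
int32_t lb_result(uint8_t raw) {
    int32_t value = raw;
    if (value & 0x80) {
        value -= 256;   /* top bit set: treat the byte as negative */
    }
    return value;
}

/* What lbu leaves in the register: the byte zero-extended to 32 bits. */
int32_t lbu_result(uint8_t raw) {
    return (int32_t)raw;
}
```

For small positive values like z = 30 the two agree. They only differ once the top bit of the byte is set.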


Control flow

while(true) and do-while

The easiest kinds of loops.

Infinite loop

C:

    while(true) {
        // loop code
    }

MIPS:

    _loop_top:
        # loop code
    j _loop_top

Notes: To break, go to a label after the j at the end.

do-while

C:

    do {
        t0++;
    } while(t0 < 10);

MIPS:

    _loop_top:
        add t0, t0, 1
    blt t0, 10, _loop_top

Notes: Like an infinite loop, but replace j with a branch. The condition is not inverted.
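One way to see these two shapes is to write the C with goto, so that each line corresponds to one MIPS instruction. This is only a sketch: the function names and the limit parameter are invented for the example.

```c
/* Infinite loop with a break: leave by jumping to a label after the j. */
int count_until(int limit) {
    int t0 = 0;
loop_top:
    if (t0 >= limit) goto loop_break;   /* the "break" */
    t0++;                               /* loop code */
    goto loop_top;                      /* j _loop_top */
loop_break:
    return t0;
}

/* do { t0++; } while (t0 < 10); */
int do_while_from(int t0) {
loop_top:
    t0++;                               /* add t0, t0, 1 */
    if (t0 < 10) goto loop_top;         /* blt t0, 10, _loop_top */
    return t0;
}
```

Note that do_while_from always runs its body at least once, even when t0 starts at 10 or more.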

if and while with conditions

These two are similar: in both, you have to invert the condition, because the branch skips the body when the condition is false.

simple if

C:

    if(t0 == 10) {
        // then code
    }

MIPS:

    bne t0, 10, _endif
        # then code
    _endif:

Notes: Invert the condition. Branch to the label after the “then” code.

simple while

C:

    while(t0 != 10) {
        // code
        t0++;
    }

MIPS:

    _loop:
        beq t0, 10, _break
        # code
        add t0, t0, 1
        j   _loop
    _break:

Notes: Invert the condition. Branch to the label after the loop. At the bottom of the loop, j back up to the condition.
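The inverted conditions may look less strange in goto-style C, where the "skip the body" branch is explicit. A minimal sketch, with made-up function names and return values that just record what happened:

```c
/* if (t0 == 10) { result = 1; } */
int simple_if(int t0) {
    int result = 0;
    if (t0 != 10) goto endif;   /* bne t0, 10, _endif (inverted!) */
    result = 1;                 /* then code */
endif:
    return result;
}

/* while (t0 != 10) { iterations++; t0++; } */
int simple_while(int t0) {
    int iterations = 0;
loop:
    if (t0 == 10) goto brk;     /* beq t0, 10, _break (inverted!) */
    iterations++;               /* code */
    t0++;                       /* add t0, t0, 1 */
    goto loop;                  /* j _loop */
brk:
    return iterations;
}
```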

if-else and if-else if-else if...

Don’t forget that you have to jump over the else.

if-else

C:

    if(t0 == 10) {
        // then code
    } else {
        // else code
    }

MIPS:

    bne t0, 10, _else
        # then code
    j _endif # note!!!
    _else:
        # else code
    _endif:

Notes: Invert the condition. Branch to the else. Jump over the else.

if-else if-else

C:

    if(t0 == 0) {
        // A code
    } else if(t0 == 1) {
        // B code
    } else {
        // C code
    }

MIPS:

    bne t0, 0, _else_if
        # A code
    j _endif # note!!!
    _else_if:
    bne t0, 1, _else
        # B code
    j _endif # note!!!
    _else:
        # C code
    _endif:

Notes: Invert all conditions. Branch to the next condition. Jump to the end of the chain at the end of each case.
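To see why the jump to the end matters, here is the chain sketched in goto-style C, next to a deliberately broken copy that forgets the jump after the A code. Both function names are hypothetical, and the returned letter just records which case ran last.

```c
/* The if / else if / else chain, one goto per branch or jump. */
char classify(int t0) {
    char which;
    if (t0 != 0) goto else_if;  /* bne t0, 0, _else_if */
    which = 'A';                /* A code */
    goto endif;                 /* j _endif */
else_if:
    if (t0 != 1) goto else_;    /* bne t0, 1, _else */
    which = 'B';                /* B code */
    goto endif;                 /* j _endif */
else_:
    which = 'C';                /* C code */
endif:
    return which;
}

/* Same chain, but the "j _endif" after the A code is missing. */
char classify_missing_jump(int t0) {
    char which;
    if (t0 != 0) goto else_if;
    which = 'A';
    /* oops: falls straight into the next test */
else_if:
    if (t0 != 1) goto else_;
    which = 'B';
    goto endif;
else_:
    which = 'C';
endif:
    return which;
}
```

With the jump missing, t0 == 0 runs the A code and then falls into the rest of the chain, so the C code runs too.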

for loops

These examples use t registers as the loop counters, but in many cases s registers are more appropriate. See this section of the cookbook to choose which kind.

“simple” for

C:

    for(i = 0; i < 10; i++) {
        // code
    }

MIPS:

    li t0, 0
    _loop:
        # code

    # increment is the
    # LAST THING!!!!!
    add t0, t0, 1
    blt t0, 10, _loop

Notes: Loops 1 or more times, unlike a “real” for loop. But I’d say 95% of the time, this way of writing fors is perfectly okay.

“correct” for

C:

    for(i = 0; i < 10; i++) {
        // code
    }

MIPS:

    li  t0, 0
    _loop:
        bge t0, 10, _break
            # code

        # increment is the
        # LAST THING!!!
        add t0, t0, 1
        j _loop
    _break:

Notes: Invert the condition. Loops 0 or more times, like you’re used to. But it’s a lot more work to write, and uses an inverted condition, which sucks.
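The only time the two versions behave differently is when the loop should run zero times. Here is a small goto-style C sketch of both, with the upper bound turned into a parameter n so the difference can be seen. The function names and the runs counter are invented for the example.

```c
/* "simple" for: test at the bottom, so the body always runs once. */
int simple_for(int n) {
    int runs = 0;
    int t0 = 0;                 /* li t0, 0 */
loop:
    runs++;                     /* code */
    t0++;                       /* add t0, t0, 1 */
    if (t0 < n) goto loop;      /* blt t0, n, _loop */
    return runs;
}

/* "correct" for: inverted test at the top, so the body can be skipped. */
int correct_for(int n) {
    int runs = 0;
    int t0 = 0;                 /* li t0, 0 */
loop:
    if (t0 >= n) goto brk;      /* bge t0, n, _break */
    runs++;                     /* code */
    t0++;                       /* add t0, t0, 1 */
    goto loop;                  /* j _loop */
brk:
    return runs;
}
```

For any n of 1 or more the two agree. For n = 0 the simple version still runs the body once.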

switch-case

Think of them like a highway with exits. You pass the exits you don’t care about until you get to the one you do.

switch

C:

    switch(x) {
        case 0:
            ...
            break;
        case 1:
            ...
            break;
        case 2:
            ...
            break;
        case 3:
            ...
            break;
        default:
            ...
            break;
    }

MIPS:

    lw t0, x
    beq t0, 0, _case_0
    beq t0, 1, _case_1
    beq t0, 2, _case_2
    beq t0, 3, _case_3
    j _default
    _case_0:
        ...
        j _break
    _case_1:
        ...
        j _break
    _case_2:
        ...
        j _break
    _case_3:
        ...
        j _break
    _default:
        ...
        # j _break not needed
    _break:

♫♩
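The highway picture shows up clearly in goto-style C: first a run of "exits" (one test per case), then the case bodies, each ending in a jump to the break label. This is a sketch only; the dispatch function and the values it returns are made up so there is something to observe.

```c
int dispatch(int x) {
    int result;
    if (x == 0) goto case_0;    /* beq t0, 0, _case_0 */
    if (x == 1) goto case_1;    /* beq t0, 1, _case_1 */
    if (x == 2) goto case_2;    /* beq t0, 2, _case_2 */
    if (x == 3) goto case_3;    /* beq t0, 3, _case_3 */
    goto default_;              /* j _default */
case_0:
    result = 100;
    goto brk;                   /* j _break */
case_1:
    result = 101;
    goto brk;
case_2:
    result = 102;
    goto brk;
case_3:
    result = 103;
    goto brk;
default_:
    result = -1;                /* no jump needed, brk is next */
brk:
    return result;
}
```

The conditions here are not inverted, because each test branches to the code rather than around it.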


Conditions

Remember that in HLLs, && and || use short-circuit evaluation:

&&

C:

    if(x >= 10 && x < 20) {
        println("in range");
    }

MIPS:

    lw t0, x
    blt t0, 10, _out_of_range
    bge t0, 20, _out_of_range

        println_str "in range"

    _out_of_range:

Notes: Invert both conditions. Branch to the label after the if. Think of it like, “if any of these are not satisfied, don’t run the code inside.”

||

C:

    if(x == 10 || x == 11) {
        println("one of em");
    }

MIPS:

    lw t0, x
    beq t0, 10, _one_of_them
    bne t0, 11, _neither
    _one_of_them:
        println_str "one of em"

    _neither:

Notes: Invert only the last condition. The earlier conditions branch straight to the code inside. Think of it like, “if any of these is satisfied, run the code inside.”
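Both patterns can be checked in goto-style C, where each if/goto pair stands for one branch instruction. A minimal sketch: the function names are hypothetical, and the ran flag stands in for the println so the result can be tested.

```c
/* if (x >= 10 && x < 20): any failed test skips the body. */
int in_range(int x) {
    int ran = 0;
    if (x < 10) goto out_of_range;   /* blt t0, 10, _out_of_range */
    if (x >= 20) goto out_of_range;  /* bge t0, 20, _out_of_range */
    ran = 1;                         /* body */
out_of_range:
    return ran;
}

/* if (x == 10 || x == 11): any passed test enters the body. */
int ten_or_eleven(int x) {
    int ran = 0;
    if (x == 10) goto run_body;      /* beq t0, 10, _one_of_them */
    if (x != 11) goto neither;       /* bne t0, 11, _neither */
run_body:
    ran = 1;                         /* body */
neither:
    return ran;
}
```

In both functions the later test is never reached once an earlier one has decided the outcome, which is the short-circuit behavior.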

Another way of thinking of && and ||

&& is really like a nested if.

// same as (x >= 10 && x < 20)
if(x >= 10) {
    if(x < 20) {
        println("in range");
    }
}

|| is like… well, if you’re using == it’s like a switch-case. It’s more complicated otherwise, but I think this is enough to show the idea:

// same as (x == 10 || x == 11)
switch(x) {
    case 10:
    case 11:
        println("one of em");
        break;
}

// NOT VALID C/JAVA but okay in some other languages:
switch {
    case x < 3:
    case x > 10:
        println("outside range [3..10]");
        break;
}

1D Arrays

Arrays are just variables sitting next to each other. They’re spooning.

Making an array: int array[] = {1, 2, 3, 4, 5};

.data
    array: .word 1, 2, 3, 4, 5
.text

Getting a value: array[i]

Assume i is represented by s0. This is the short way that uses the special form of lw to skip the la and add:

    mul t0, s0, 4     # t0 = Si: multiply index by size of *one item in the array*
    lw  t1, array(t0) # t1 = array[i]

This is the long way that does the explicit address calculation:

    la  t0, array  # t0 = A:  get base address
    mul t1, s0, 4  # t1 = Si: multiply index by size of *one item in the array*
    add t0, t0, t1 # t0 = A + Si
    lw  t1, (t0)   # t1 = array[i]

If the array is an array of bytes, you can skip the mul step, since you’d be multiplying the index by 1. That means in the short form, accessing an array of bytes is just one line!

Setting a value: array[i] = 10

Assume i is represented by s0.

This is the short way that uses the special form of sw to skip the la and add:

    li  t1, 10        # t1 = 10
    mul t0, s0, 4     # t0 = Si: multiply index by size of *one item in the array*
    sw  t1, array(t0) # array[i] = 10

This is the long way that does the explicit address calculation:

    # exactly the same address calculation as above!
    la  t0, array  # t0 = A:  get base address
    mul t1, s0, 4  # t1 = Si: multiply index by size of one item in the array
    add t0, t0, t1 # t0 = A + Si

    # just this differs.
    li  t1, 10     # t1 = 10
    sw  t1, (t0)   # array[i] = 10
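The address calculation A + Si is the same thing C does behind the scenes for array[i]. Here is a rough sketch that does it by hand on byte addresses, so the multiply by the item size is visible. The names get_item and set_item are invented, and the global array is a stand-in for the .data declaration.

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for:  array: .word 1, 2, 3, 4, 5 */
static int32_t array[] = {1, 2, 3, 4, 5};

/* array[i], computed the long way as A + S*i. */
int32_t get_item(int i) {
    unsigned char *addr = (unsigned char *)array;   /* la  t0, array   */
    addr += i * (int)sizeof(int32_t);               /* mul, then add   */
    int32_t value;
    memcpy(&value, addr, sizeof value);             /* lw  t1, (t0)    */
    return value;
}

/* array[i] = value, with exactly the same address calculation. */
void set_item(int i, int32_t value) {
    unsigned char *addr = (unsigned char *)array;   /* la  t0, array   */
    addr += i * (int)sizeof(int32_t);               /* mul, then add   */
    memcpy(addr, &value, sizeof value);             /* sw  t1, (t0)    */
}
```

Only the last step differs between getting and setting, just like in the MIPS.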

2D Arrays

2D arrays are just 1D arrays sitting next to each other. They’re spooning arrays, full of spooning variables. Meta-spooning.

These are “row-major” arrays:

Making a 2D array: int array[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };

.data
    array: .word 1, 2, 3, 4, 5, 6, 7, 8, 9

    # or, if you like, you can put newlines (this is the same array again, so only declare it one way)
    array: .word
        1, 2, 3
        4, 5, 6
        7, 8, 9
.text

Calculating the address of an item: array[row][col]

It’s a good idea to make constants for the dimensions and item sizes.

# width and height of the array
.eqv ARRAY_W 3
.eqv ARRAY_H 3

# size of 1 row = width * size of 1 item
# items are 4 bytes each, so 3 * 4
.eqv ARRAY_ROW_SIZE 12

To translate array[row][col], assuming row is s0 and col is s1:

    la  t0, array              # t0 = A:  get base address
    mul t1, s0, ARRAY_ROW_SIZE # t1 = Rr: multiply row by size of *one row*
    mul t2, s1, 4              # t2 = Bc: multiply col by size of *one item*
    add t0, t0, t1
    add t0, t0, t2             # t0 = A + Rr + Bc

    # now load/store at address (t0)
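The same A + Rr + Bc calculation can be sketched in C on byte addresses and compared against what you would expect from array[row][col]. The constant names mirror the .eqv lines above; ITEM_SIZE and get_item_2d are additions made up for this example.

```c
#include <stdint.h>
#include <string.h>

enum {
    ARRAY_W = 3,
    ARRAY_H = 3,
    ITEM_SIZE = 4,                          /* one .word is 4 bytes */
    ARRAY_ROW_SIZE = ARRAY_W * ITEM_SIZE    /* 12 bytes per row */
};

/* Row-major: all of row 0, then all of row 1, then all of row 2. */
static const int32_t array[ARRAY_H][ARRAY_W] = {
    {1, 2, 3},
    {4, 5, 6},
    {7, 8, 9},
};

/* array[row][col], computed as A + R*row + B*col. */
int32_t get_item_2d(int row, int col) {
    const unsigned char *addr = (const unsigned char *)array; /* la  */
    addr += row * ARRAY_ROW_SIZE;           /* mul t1, s0, ARRAY_ROW_SIZE */
    addr += col * ITEM_SIZE;                /* mul t2, s1, 4 */
    int32_t value;
    memcpy(&value, addr, sizeof value);     /* load at address (t0) */
    return value;
}
```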

Function calls

For writing functions, see the cookbook instead.

There are two (or three) steps:

  1. Put the arguments in the argument registers, starting with a0
  2. Call the function
  3. If it returns something, the return value is in v0.

Calling a syscall: print_int(10)

    li a0, 10
    li v0, 1 # number of print_int syscall
    syscall

Calling a syscall that returns something: x = read_int()

    # no arguments!
    li v0, 5 # number of read_int syscall
    syscall
    sw v0, x # return value is in v0

Calling a regular function: addPoints(50)

    li  a0, 50
    jal addPoints

    # if it returned something, it would be in v0.
    # because YOU put it there. right? ;D

Structs

A struct is a group of variables next to each other in memory, but unlike an array, they can be different types and sizes.

Let’s say we have this C struct:

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

A small C program that prints sizeof(Enemy) and offsetof() for each field shows the size of the entire struct and the offset of each field from the beginning of the struct.
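Such a program could look roughly like this. It is a sketch rather than the original program: print_enemy_layout is a made-up name, and the numbers it prints depend on the compiler and target.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

/* Print the struct size and the offset of every field. */
void print_enemy_layout(void) {
    printf("sizeof(Enemy) = %zu\n", sizeof(Enemy));
    printf("active at %zu\n", offsetof(Enemy, active));
    printf("angry  at %zu\n", offsetof(Enemy, angry));
    printf("x      at %zu\n", offsetof(Enemy, x));
    printf("y      at %zu\n", offsetof(Enemy, y));
    printf("health at %zu\n", offsetof(Enemy, health));
}
```

Assuming a typical compiler with a 1-byte bool and a 4-byte, 4-byte-aligned int, there are 2 bytes of padding after angry so that x starts on a multiple of 4.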

Unfortunately the MARS assembler has no direct support for structs, so we have to do them manually.

Following the C compiler’s lead, we can declare some constants for the field offsets and struct size:

.eqv Enemy_active 0
.eqv Enemy_angry  1
.eqv Enemy_x      4
.eqv Enemy_y      8
.eqv Enemy_health 12
.eqv Enemy_sizeof 16

If we want one copy of the struct, we can declare a variable of it like this:

.data
    .align 2 # IMPORTANT!!!!!!!!!!!!!!!!!!!!!!!!! ensures alignment for the 'word' variables
    one_enemy: .space Enemy_sizeof

If we want an array of the struct, we have to calculate the size ourselves, as MARS has no support for arithmetic expressions (most other assemblers do):

.data
    .eqv NUM_ENEMIES 10 # length of the array

    .align 2 # IMPORTANT!!!!!!!!!!!!!!!!!!!!!!!!!
    array_of_enemies: .space 160 # == NUM_ENEMIES * Enemy_sizeof

    # alternatively you could write ".space Enemy_sizeof" 10 times
    # but that's....... well........................... weird

In either case, we access the struct like this:

  1. get a pointer to the beginning of the struct
  2. access its fields using the offset(reg) form of loads and stores

For example:

    # t0 = &one_enemy (in C syntax)
    la  t0, one_enemy

    # one_enemy.active = true
    li  t1, 1
    sb  t1, Enemy_active(t0) # adds field offset to base address

    # one_enemy.angry = false
    sb  zero, Enemy_angry(t0)

So whenever we have a struct pointer in a register, we write Struct_field(reg) to load/store its fields.

for loops over an array of structs work like loops over any other array, but remember that each item is going to be Enemy_sizeof bytes apart. For this reason, a “walking pointer” loop is often a better fit.

    li  s0, 0 # i = 0
    la  s1, array_of_enemies # start the pointer at the beginning of the array
_loop_top:
        # array_of_enemies[i].x++
        lw  t0, Enemy_x(s1)
        add t0, t0, 1
        sw  t0, Enemy_x(s1)

    add s1, s1, Enemy_sizeof # walk the pointer forward by the size of one enemy
    add s0, s0, 1 # i++
    blt s0, NUM_ENEMIES, _loop_top
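In C, the walking pointer is just an Enemy * that gets incremented. The compiler turns p++ into "add sizeof(Enemy) bytes", which is the add s1, s1, Enemy_sizeof line above. A sketch under the same layout assumptions as before, with bump_all_x as a made-up name:

```c
#include <stdbool.h>

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

enum { NUM_ENEMIES = 10 };

/* Stand-in for:  array_of_enemies: .space 160  (zero-filled) */
static Enemy array_of_enemies[NUM_ENEMIES];

/* array_of_enemies[i].x++ for every i, using a walking pointer. */
void bump_all_x(void) {
    int i = 0;                      /* li s0, 0 */
    Enemy *p = array_of_enemies;    /* la s1, array_of_enemies */
    do {
        p->x++;                     /* lw / add / sw at Enemy_x(s1) */
        p++;                        /* add s1, s1, Enemy_sizeof */
        i++;                        /* add s0, s0, 1 */
    } while (i < NUM_ENEMIES);      /* blt s0, NUM_ENEMIES, _loop_top */
}
```

The do-while matches the MIPS loop, which tests at the bottom and therefore assumes there is at least one enemy.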