Compilers take high-level code (like C) and turn it into machine code. MIPS was designed to be easy for compilers to generate code for.

You can turn any high-level pseudocode into MIPS by hand, just by following the same rules that compilers use. They’re BRAIN ALGORITHMS.

Variables

Assume you have some global variables like this:

.data
    x: .word 10
    y: .word 20
    z: .byte 30
.text
Get value

C:
    x
    z

MIPS:
    lw t0, x
    lb t1, z # or lbu

Notes: The type of load instruction must match the variable type. For halfwords and bytes, you decide whether they’re signed (lh, lb) or unsigned (lhu, lbu).

Set value

C:
    x = 30
    z = 10

MIPS:
    li t0, 30
    sw t0, x
    li t0, 10
    sb t0, z

Notes: The type of store instruction must match the variable type.

Read-modify-write

C:
    x++

MIPS:
    lw  t0, x
    add t0, t0, 1
    sw  t0, x

Notes: Combine the above two.

Copy variable

C:
    x = y

MIPS:
    lw t0, y
    sw t0, x

Notes: Same idea.
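Choosing between a signed and an unsigned byte load can be sketched in C as two ways of widening the same byte. This is only an illustration: load_byte_signed and load_byte_unsigned are hypothetical names, and the sketch assumes the usual two's-complement conversion to int8_t.

```c
#include <stdint.h>

/* Mimics lb: read one byte and sign-extend it to 32 bits. */
int32_t load_byte_signed(const uint8_t *addr) {
    return (int32_t)(int8_t)*addr;
}

/* Mimics lbu: read one byte and zero-extend it to 32 bits. */
int32_t load_byte_unsigned(const uint8_t *addr) {
    return (int32_t)*addr;
}
```

For a byte below 0x80, such as z = 30, both give the same result. They differ only when the top bit of the byte is set.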


Control flow

while(true) and do-while

The easiest kinds of loops.

Infinite loop

C:
    while(true) {
        println("hi!")
    }

MIPS:
    _loop_top:
        println_str "hi!"
        j _loop_top

Notes: To break, go to a label after the j.

do-while

C:
    do {
        t0++;
    } while(t0 < 10);

MIPS:
    _loop_top:
        add t0, t0, 1
        blt t0, 10, _loop_top

Notes: Like an infinite loop, but replace j with a branch.

if and while with conditions

These two are similar, but you have to flip the conditions.

simple if

C:
    if(t0 == 10) {
        println("yes!")
    }

MIPS:
        bne t0, 10, _not_10
        println_str "yes!"
    _not_10:

Notes: Invert the condition. Branch to a label after the if.

simple while

C:
    while(t0 != 10) {
        t0++;
    }

MIPS:
    _loop:
        beq t0, 10, _break
        add t0, t0, 1
        j   _loop
    _break:

Notes: Invert the condition. Branch to a label after the loop. End the body with a j back to the condition.
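One way to see the inverted condition is to rewrite the while loop in C using only if and goto, which is roughly the shape the MIPS takes. This is a sketch, not compiler output, and while_goto is a hypothetical name.

```c
/* Goto form of: while (t0 != 10) { t0++; }
   Returns the final value of t0. */
int while_goto(int t0) {
loop:
    if (t0 == 10) goto done;   /* inverted condition: leave when t0 == 10 */
    t0++;                      /* loop body */
    goto loop;                 /* jump back up to the condition */
done:
    return t0;
}
```

The test at the top is the opposite of the one written in the while, because it decides when to leave the loop rather than when to stay in it.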

for loops

These examples use t registers as the loop counters, but in many cases s registers are more appropriate. See the cookbook for how to choose between them.

“correct” for

C:
    for(i = 0; i < 10; i++) {
        // code
    }

MIPS:
        li  t0, 0
    _loop:
        bge t0, 10, _break

        # code

        add t0, t0, 1
        j   _loop
    _break:

Notes: Invert the condition. Loops 0 or more times, like you’re used to.

“cheesy” for

C:
    for(i = 0; i < 10; i++) {
        // code
    }

MIPS:
        li t0, 0
    _loop:
        # code

        add t0, t0, 1
        blt t0, 10, _loop

Notes: Shorter and simpler, but it loops 1 or more times, unlike a real for loop. If the loop bound is a constant that guarantees at least one iteration, that’s no problem.
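The difference between the two forms only shows up when the bound can be 0. A small C sketch with a variable bound n makes it visible. The function names are hypothetical, and each function returns how many times its loop body ran.

```c
/* "Correct" for: test first, so the body runs 0 or more times. */
int correct_for(int n) {
    int runs = 0;
    int i = 0;
top:
    if (i >= n) goto done;   /* inverted condition, checked before the body */
    runs++;                  /* loop body */
    i++;
    goto top;
done:
    return runs;
}

/* "Cheesy" for: test last, so the body always runs at least once. */
int cheesy_for(int n) {
    int runs = 0;
    int i = 0;
top:
    runs++;                  /* loop body */
    i++;
    if (i < n) goto top;     /* original condition, checked after the body */
    return runs;
}
```

With n = 10 both run the body 10 times. With n = 0 the first runs it 0 times and the second runs it once.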

switch-case

Think of them like a highway with exits. You pass the exits you don’t care about until you get to the one you do.

The default case ends up at the top, unlike where we usually write it in HLLs. Of course you could just put a _default label at the bottom and jump to it.

switch

C:
    switch(x) {
        case 0:
            ...
            break;
        case 1:
            ...
            break;
        case 2:
            ...
            break;
        case 3:
            ...
            break;
        default:
            ...
            break;
    }

MIPS:
        lw t0, x
        beq t0, 0, _case_0
        beq t0, 1, _case_1
        beq t0, 2, _case_2
        beq t0, 3, _case_3

        # default is *HERE* instead
        ...
        j _break
    _case_0:
        ...
        j _break
    _case_1:
        ...
        j _break
    _case_2:
        ...
        j _break
    _case_3:
        ...
        j _break
    _break:

Notes: ♫♩
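The same highway-with-exits layout can be written in C with if and goto. This is a sketch under made-up assumptions: switch_goto is a hypothetical name, and the values it returns (100 to 103, or -1 for the default) are placeholders standing in for the "..." bodies.

```c
/* Goto form of a switch over 0..3 with a default case. */
int switch_goto(int x) {
    int result;
    if (x == 0) goto case_0;   /* each test is one "exit" */
    if (x == 1) goto case_1;
    if (x == 2) goto case_2;
    if (x == 3) goto case_3;

    /* default lands here, right after the tests */
    result = -1;
    goto done;
case_0:
    result = 100;
    goto done;
case_1:
    result = 101;
    goto done;
case_2:
    result = 102;
    goto done;
case_3:
    result = 103;
    goto done;
done:
    return result;
}
```

Each goto done plays the role of break. Leaving one out would fall through into the next case, just as in a real switch.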


Conditions

Remember that in HLLs, && and || use short-circuit evaluation: the right-hand side is only evaluated if the left-hand side hasn’t already decided the result.

&&

C:
    if(x >= 10 && x < 20) {
        println("in range");
    }

MIPS:
        lw t0, x
        blt t0, 10, _out_of_range
        bge t0, 20, _out_of_range

        println_str "in range"

    _out_of_range:

Notes: Invert both conditions. Branch to a label after the if. Think of it like, “if any of these are not satisfied, don’t run the code inside.”

||

C:
    if(x == 10 || x == 11) {
        println("one of em")
    }

MIPS:
        lw t0, x
        beq t0, 10, _one_of_them
        bne t0, 11, _neither
    _one_of_them:
        println_str "one of em"

    _neither:

Notes: Keep every condition except the last as written and branch into the body. Invert only the last condition and branch past the body. Think of it like, “if any of these is satisfied, run the code inside.”
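Both patterns can be sketched in C with if and goto, which makes it easier to see which conditions get inverted. The function names are hypothetical, and returning true stands in for running the body of the if.

```c
#include <stdbool.h>

/* Goto form of: x >= 10 && x < 20 */
bool in_range(int x) {
    if (x < 10) goto no;     /* first condition, inverted */
    if (x >= 20) goto no;    /* second condition, inverted */
    return true;             /* body of the if */
no:
    return false;
}

/* Goto form of: x == 10 || x == 11 */
bool one_of_them(int x) {
    if (x == 10) goto yes;   /* first condition, kept as written */
    if (x != 11) goto no;    /* last condition, inverted */
yes:
    return true;             /* body of the if */
no:
    return false;
}
```

In both functions a later test is skipped as soon as an earlier one decides the answer, which is the short-circuit behavior.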

Another way of thinking of && and ||

&& is really like a nested if.

// same as (x >= 10 && x < 20)
if(x >= 10) {
    if(x < 20) {
        println("in range");
    }
}

|| is like… well, if you’re using == it’s like a switch-case. It’s more complicated otherwise, but I think this is enough to show the idea:

// same as (x == 10 || x == 11)
switch(x) {
    case 10:
    case 11:
        println("one of em")
        break;
}

// NOT VALID C/JAVA but okay in some other languages:
switch {
    case x < 3:
    case x > 10:
        println("outside range [3..10]");
        break;
}

1D Arrays

Arrays are just variables sitting next to each other. They’re spooning.

Making an array: int array[] = {1, 2, 3, 4, 5};

.data
    array: .word 1, 2, 3, 4, 5
.text

Getting a value: array[i]

Assume i is represented by s0.

    la  t0, array  # t0 = A:  get base address
    mul t1, s0, 4  # t1 = Bi: multiply index by size of *one item in the array*
    add t0, t0, t1 # t0 = A + Bi
    lw  t1, (t0)   # t1 = array[i]

If the array is an array of bytes, you can skip the mul step - since you’d be multiplying by 1.

Setting a value: array[i] = 10

Assume i is represented by s0.

    # exactly the same address calculation as above!
    la  t0, array  # t0 = A:  get base address
    mul t1, s0, 4  # t1 = Bi: multiply index by size of one item in the array
    add t0, t0, t1 # t0 = A + Bi

    # just this differs.
    li  t1, 10     # t1 = 10
    sw  t1, (t0)   # array[i] = 10
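The A + Bi address calculation can also be written out by hand in C using byte pointers. This is a sketch assuming 4-byte int32_t items; load_word_at and store_word_at are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns array[i] by computing the byte address A + B*i by hand. */
int32_t load_word_at(const int32_t *array, size_t i) {
    const unsigned char *a = (const unsigned char *)array;  /* A: base address */
    const unsigned char *addr = a + i * sizeof(int32_t);    /* A + Bi, with B = 4 */
    return *(const int32_t *)addr;                          /* load a word */
}

/* Performs array[i] = value using the same address calculation. */
void store_word_at(int32_t *array, size_t i, int32_t value) {
    unsigned char *addr = (unsigned char *)array + i * sizeof(int32_t);
    *(int32_t *)addr = value;                               /* store a word */
}
```

Normally C hides the multiplication: writing array[i] makes the compiler scale the index by the item size for you.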

2D Arrays

2D arrays are just 1D arrays sitting next to each other. They’re spooning arrays, full of spooning variables. Meta-spooning.

These are “row-major” arrays:

Making a 2D array: int array[3][3] = { {1, 2, 3}, {4, 5, 6}, {7, 8, 9} };

.data
    array: .word 1, 2, 3, 4, 5, 6, 7, 8, 9

    # or, if you like, you can put newlines (use one declaration or the other, not both)
    array: .word
        1, 2, 3
        4, 5, 6
        7, 8, 9
.text

Calculating the address of an item: array[row][col]

It’s a good idea to make constants for the dimensions and item sizes.

# width and height of the array
.eqv ARRAY_W 3
.eqv ARRAY_H 3

# size of 1 row = width * size of 1 item
# items are 4 bytes each, so 3 * 4
.eqv ARRAY_ROW_SIZE 12

To translate array[row][col], assuming row is s0 and col is s1:

    la  t0, array              # t0 = A:  get base address
    mul t1, s0, ARRAY_ROW_SIZE # t1 = Rr: multiply row by size of *one row*
    mul t2, s1, 4              # t2 = Bc: multiply col by size of *one item*
    add t0, t0, t1
    add t0, t0, t2             # t0 = A + Rr + Bc

    # now load/store at address (t0)
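The row-major formula A + Rr + Bc can be checked against C's own indexing. In this sketch ITEM_SIZE and item_address are hypothetical names, and the items are assumed to be 4-byte int32_t values.

```c
#include <stdint.h>

enum { ARRAY_W = 3, ARRAY_H = 3 };
enum { ITEM_SIZE = sizeof(int32_t) };            /* B: size of one item */
enum { ARRAY_ROW_SIZE = ARRAY_W * ITEM_SIZE };   /* R: size of one row */

/* Returns the address of array[row][col], computed as A + R*row + B*col. */
int32_t *item_address(int32_t array[ARRAY_H][ARRAY_W], int row, int col) {
    unsigned char *a = (unsigned char *)array;   /* A: base address */
    return (int32_t *)(a + row * ARRAY_ROW_SIZE + col * ITEM_SIZE);
}
```

For any in-range row and col, the pointer returned should equal &array[row][col], since C stores 2D arrays row by row.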

Function calls

For writing functions, see the cookbook instead.

There are two (or three) steps:

  1. Put the arguments in the argument registers, starting with a0
  2. Call the function
  3. If it returns something, the return value is in v0.

Calling a syscall: print_int(10)

    li a0, 10
    li v0, 1 # number of print_int syscall
    syscall

Calling a syscall that returns something: x = read_int()

    # no arguments!
    li v0, 5 # number of read_int syscall
    syscall
    sw v0, x # return value is in v0

Calling a regular function: addPoints(50)

    li  a0, 50
    jal addPoints

    # if it returned something, it would be in v0.
    # because YOU put it there. right? ;D

Structs

A struct is a group of variables next to each other in memory, but unlike an array, they can be different types and sizes.

Let’s say we have this C struct:

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

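A C compiler reports the size of the whole struct with sizeof and the offset of each field from the beginning of the struct with offsetof. With a 1-byte bool and a 4-byte int, active and angry land at offsets 0 and 1, x, y, and health at 4, 8, and 12, and the whole struct is 16 bytes. The 2 bytes after angry are padding that keeps x 4-byte aligned.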
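A minimal sketch of a program that shows this layout follows. It assumes a typical target where bool is 1 byte and int is 4 bytes; print_enemy_layout is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

/* Prints the struct size and each field's offset from the start of the struct. */
void print_enemy_layout(void) {
    printf("sizeof(Enemy) = %zu\n", sizeof(Enemy));
    printf("active at %zu\n", offsetof(Enemy, active));
    printf("angry  at %zu\n", offsetof(Enemy, angry));
    printf("x      at %zu\n", offsetof(Enemy, x));
    printf("y      at %zu\n", offsetof(Enemy, y));
    printf("health at %zu\n", offsetof(Enemy, health));
}
```

Calling print_enemy_layout() from main prints the numbers your compiler actually uses. On a target with different type sizes or alignment rules they may differ.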

Unfortunately the MARS assembler has no direct support for structs, so we have to do them manually.

Following the C compiler’s lead, we can declare some constants for the field offsets and struct size:

.eqv Enemy_active 0
.eqv Enemy_angry  1
.eqv Enemy_x      4
.eqv Enemy_y      8
.eqv Enemy_health 12
.eqv Enemy_sizeof 16

If we want one copy of the struct, we can declare a variable of it like this:

.data
    one_enemy: .space Enemy_sizeof

If we want an array of the struct, we have to calculate the size ourselves, because MARS has no support for arithmetic expressions (most assemblers do):

.data
    .eqv NUM_ENEMIES 10 # length of the array

    array_of_enemies: .space 160 # == NUM_ENEMIES * Enemy_sizeof

    # alternatively you could write ".space Enemy_sizeof" 10 times
    # but that's....... well........................... weird

In either case, we access the struct like this:

  1. get a pointer to the beginning of the struct
  2. access its fields using the offset(reg) form of loads and stores

For example:

    la  t0, one_enemy

    # one_enemy.active = true
    li  t1, 1
    sb  t1,   Enemy_active(t0) # adds field offset to base address

    # one_enemy.angry = false
    sb  zero, Enemy_angry(t0)

So whenever we have a struct pointer in a register, we write Struct_field(reg) to load/store its fields.

for loops over an array of structs work like for loops over any other array, but remember that the items are Enemy_sizeof bytes apart.

    li  s0, 0
_loop_top:
        # calculate pointer to array_of_enemies[i]
        la  t0, array_of_enemies
        mul t1, s0, Enemy_sizeof
        add t0, t0, t1

        # array_of_enemies[i].x++
        lw  t1, Enemy_x(t0)
        add t1, t1, 1
        sw  t1, Enemy_x(t0)

    add s0, s0, 1
    blt s0, NUM_ENEMIES, _loop_top

Of course, you could also use the “walking pointer” technique to iterate over an array of structs, and in fact that might be easier: you just have one add t0, t0, Enemy_sizeof per iteration.
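The walking-pointer idea might look like this in C. It is a sketch: bump_all_x is a hypothetical name, and NUM_ENEMIES is assumed to be at least 1, as in the loop above.

```c
#include <stdbool.h>

typedef struct {
    bool active;
    bool angry;
    int x;
    int y;
    int health;
} Enemy;

enum { NUM_ENEMIES = 10 };

/* Does enemies[i].x++ for every item, with no index or multiply:
   a single pointer walks from the first struct to one past the last. */
void bump_all_x(Enemy *enemies) {
    Enemy *p = enemies;                  /* pointer to the current struct */
    Enemy *end = enemies + NUM_ENEMIES;  /* one past the last struct */
    do {
        p->x++;                          /* read-modify-write one field */
        p++;                             /* advance by sizeof(Enemy) bytes */
    } while (p != end);
}
```

The p++ is the C counterpart of adding the struct size to the pointer register once per iteration, and comparing against end replaces the loop counter.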