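This week you’ll learn some of the basic things that you’ll need for many programs in the future, such as printing integers and strings, reading input from the user, doing a little math, and exiting your program properly.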


1. Printing integers

Let’s start by just printing some integers. (Strings are a little more advanced.)

Maybe you should open up your lab 0 in MARS to refer back to it!

  1. Make a new file in MARS.
  2. Save it in an appropriate location as abc123_lab1.asm where abc123 is your username.
  3. Just like lab 0, put your full name and username in comments at the top, and make your main function label.
    • Don’t forget the .global main.
  4. Assemble to make sure you got the syntax right.
  5. Inside main (that is, after the main: label), type this (and remember to keep the indentation nice):

         # print 123
         li a0, 123
         li v0, 1
         syscall
    
  6. Assemble and run, and you’ll see the number 123 printed in the “Run I/O”.

What’s going on here? What is a0? What is v0? What is syscall?

As you’ll find out in 449, programs have to ask the operating system to get input or produce output, and we do that with system calls. MARS pretends like it’s a tiny operating system, so it has many built-in system calls to print things out, ask the user to type something, etc.

We choose which system call to run by putting a number into the v0 register before executing the syscall instruction.

From now on, I want you to treat the li v0, whatever and syscall as two halves of the same thing. You do not put any code between the li v0, whatever and the syscall. They go together. Never use syscall without first doing li v0, whatever.

In MARS, go to the “Help > Help” menu item. Then click “Syscalls” on the lower set of tabs. This tells you about what syscalls are, which ones are available, what their numbers are, and their arguments and return values.

Notice it says this at the top of the table:

Service         Code in $v0   Arguments                Result
print integer   1             $a0 = integer to print

There are lots of other possible values you can put into v0 to choose different syscalls, but syscall 1 means “print an integer.”

Then, the arguments to the syscalls go in the a registers. That’s why they are named a0, a1, etc. a is for argument.

So to sum up: the three lines above correspond roughly to this Java code:

System.out.print(123);
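For reference, here is a minimal sketch of how the whole file might look at this point. The name and username in the header comments are placeholders, and it assumes the same conventions used in this lab (register names without $, and .global main); other MIPS assemblers may expect names like $a0 and the .globl spelling instead.

```mips
# Your Name (placeholder)
# abc123 (placeholder username)

.global main
main:
    # print(123)
    li a0, 123    # a0 = the argument: the integer to print
    li v0, 1      # v0 = 1 selects the "print integer" syscall
    syscall       # ask MARS to carry it out
```

The li v0 line and the syscall line sit right next to each other, with the argument set up beforehand.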

2. Printing more integers and newlines

Right now, your program just prints out 123. Now:

  1. Add code to do the equivalent of System.out.print(456) after the first print. (It will be another 3 lines.)
    • Don’t forget to comment your code so you can tell what it does! Write something like # print(123) before the first set of lines, and something similar for the second set.
  2. Run it and look at the output.

It comes out as 123456. Well, that’s confusing! But that’s because you printed both numbers on the same line. Unfortunately there is no equivalent of System.out.println. So, we have to print the newline ourselves.

  1. Between the two prints (right after the first syscall), we are going to do another syscall:
    • for the argument a0, use '\n' (that’s single quotes around a backslash and n)
    • for the syscall number v0, use 11 (look up what syscall 11 does in the MARS help)
  2. Run it again.

You should now have 123 and then 456 on another line after it. Nice!
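If yours doesn’t look right, compare it against this sketch of one way the three prints could be laid out. It uses the same register-naming conventions as the rest of this lab, and the comments are only suggestions.

```mips
.global main
main:
    # print(123)
    li a0, 123
    li v0, 1      # syscall 1 = print integer
    syscall

    # print('\n')
    li a0, '\n'   # a character is just a small number, so li works
    li v0, 11     # syscall 11 = print character
    syscall

    # print(456)
    li a0, 456
    li v0, 1
    syscall
```

Each print is the same three-step pattern: argument into a0, syscall number into v0, then syscall.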


3. Integer input and math

Syscall 5 works a bit like Scanner.nextInt() in Java. It waits for the user to type an integer and hit enter, then puts that integer into v0. To try it out:

  1. After all the code you’ve written so far, do this:
    • print another newline with syscall 11 like you did before (you can copy and paste those lines)
    • Use syscall number 5. It takes no arguments, so you don’t have to put anything in the a0 register before using it.
  2. Run your program, and now instead of stopping, it shows:
     123
     456
     |
    

    with a flashing cursor on the line after 456, which means it is waiting for you to type something.

  3. Click on the “Run I/O” box, type a number, and hit enter. The program will end.
  4. Look at the contents of the v0 register on the right – it should hold the value that you typed in.

If you type something other than a number and hit enter, or hit enter without typing anything, you’ll get something like Runtime exception at 0x00400034: invalid integer input (syscall 5). That’s okay, that’s not a bug or anything. It’s just that syscall 5 is picky and you have to type an integer.
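On its own, reading an integer can be sketched like this, written in the same style as the rest of this lab. There is no a0 line because syscall 5 takes no arguments.

```mips
.global main
main:
    # v0 = the integer the user types in
    li v0, 5      # syscall 5 = read integer
    syscall       # waits for input; the result comes back in v0
```

Notice that v0 does double duty here: it holds the syscall number going in, and the return value coming out.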

Well if we can now get integers from the user, let’s do something with them. Let’s ask the user for two numbers, add them together, and print the sum.

Right now, at the end of main, v0 contains the number that the user typed in. We want to ask the user for another number, so you have to do syscall 5 again.

But wait. In order to do syscall 5, you have to set v0 to 5. That will destroy the first number that the user just typed in, right?

So you need to put that number somewhere SSSSSssssSSSSSSSSSSsssssafe. I sound like a snake because I’m talking about the s registers. They’re useful for saving values for later.

  1. After that syscall 5, put the value that the user typed in into s0.
    • it’s not move v0, s0. That’s backwards.
  2. Then do syscall 5 a second time.
  3. At this point, you have the two values you want to add together in s0 and v0.
    • Go to Help > Help, and near the top of the “Basic Instructions” list, you’ll see add $t1, $t2, $t3.
      • That’s not literally what you type in though. You can use any registers in place of $t1, $t2, and $t3.
      • Read the description of the instruction and see what order things go in.
    • You want to add those two values together and then print out the sum with syscall 1, so… which register should the sum go into so that you can print it out?
  4. Finally, after the add line, do syscall 1 again to print out the sum.

When done correctly, your program should ask for two numbers, then print out their sum, like:

123
456
1000
533
1533
-- program is finished running (dropped off bottom) --
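If you get stuck, here is a sketch of one possible shape for the adding part, shown by itself without the earlier prints. Treat it as something to check your own attempt against rather than the only right answer; the comments are just suggestions.

```mips
.global main
main:
    # s0 = first number
    li v0, 5
    syscall
    move s0, v0     # destination comes first: copy v0 into s0

    # v0 = second number
    li v0, 5        # this overwrites v0, which is why we saved it
    syscall

    # print(s0 + v0)
    add a0, s0, v0  # a0 = s0 + v0, since a0 is the argument to syscall 1
    li v0, 1
    syscall
```

With this sketch, typing 1000 and then 533 prints 1533, matching the sample output above.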

4. Printing strings

Now we can work on printing an actual “hello world” message.

What are strings, anyway?

Strings are arrays of characters, and each character is really a number with an agreed-upon meaning. That “agreement” is called a character encoding and says things like “the number 97 means lowercase a.” The most widespread encoding today is Unicode, but there is an older encoding called ASCII which many programs in the English-speaking world use. Go have a look at an ASCII table, paying attention to the Dec and Chr columns.

The first 128 characters of Unicode are ASCII, so technically any ASCII string is also Unicode.

This means that "abc" is really an array of 3 numbers: {97, 98, 99}. But you also need to know how long the string is, or at least where the end of it is. The convention in both the C language and MARS is to use a zero terminator: a character with the value 0 at the end of the string. That means "abc" is represented like this in memory (shown in hexadecimal):

0x61 0x62 0x63 0x00

In ASCII, each character - that is, each number - is one byte. (That’s how many bits?)
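To make the “string is just bytes” idea concrete, here is a small sketch using two assembler directives. The label names abc_string and abc_bytes are made up for illustration. Both labels should end up holding the same four bytes in memory.

```mips
.data
    # a string literal with a zero terminator added for us
    abc_string: .asciiz "abc"

    # the same four bytes written out by hand:
    # 'a', 'b', 'c', then the zero terminator
    abc_bytes:  .byte 0x61, 0x62, 0x63, 0x00
```

If you assemble this and look at the Data Segment panel, you should see the same byte pattern appear twice in a row.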

Can you put a string in a register?

I said in lab 0 that registers are like the CPU’s hands. Just like your hands, they’re limited in size. You can’t fit a car in your hands. But you can use your hands to point to a car. 👉🚗 You can use registers to point to something too.

Strings are arrays, and arrays are too big to fit in registers. Instead, we can only put the address of a string in a register. The address is where the string is located in memory, and it is a 32-bit number in the version of MIPS that we’re using. Conveniently, our registers are 32 bits too!

So no, you can’t put a string in a register, but you can put a string’s address in one.

Hello world at last

In Java, you can just write strings anywhere you like, and the compiler takes care of all of that crap for you. But in asm, we have to be a little more… literal.

Did you just scroll down to this and skip all the stuff above? Of course you didn’t! You totally read all the stuff I wrote above. All that information I could definitely ask about on an exam. Yep. You read it!

  1. Before your .global main line:
    • Add a .data line. This tells the assembler, “switch to the data segment.”
      • The data segment is the part of memory where we put… data. Variables, strings, etc.
    • After that line, write this: hello_msg: .asciiz "hello, world!\n"
      • .asciiz says, “encode the following string as ascii, with a zero terminator.”
      • You know what \n does, right?
    • After that line, put a .text line. This tells the assembler, “switch to the text segment.”
      • The text segment is where code goes.
  2. In main at the start of your program (that is, right after main: before the code you wrote previously):
    • Write la a0, hello_msg
      • la stands for load address. It puts the address of a label into a register.
      • So, this will put the address of hello_msg into a0.
    • Then, do syscall 4.
  3. Assemble and run.
    • Now the first thing it prints should be your “hello, world!” message, before printing and asking for numbers. So it should look like this (of course, what you type in for the numbers is up to you):

        hello, world!
        123
        456
        10
        20
        30
        -- program is finished running (dropped off bottom) --
      
    • You can also assemble and step one instruction at a time - if you turn on “Hexadecimal Values”, you can see the address 0x10010000 get put into a0.
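Putting those steps together, the top of the file might be laid out roughly like this. It is a sketch in the same style as the rest of this lab, and the placement of blank lines and comments is up to you.

```mips
.data
    hello_msg: .asciiz "hello, world!\n"
.text

.global main
main:
    # print(hello_msg)
    la a0, hello_msg   # a0 = the ADDRESS of the string, not the string
    li v0, 4           # syscall 4 = print string
    syscall

    # ... the integer printing and adding code continues here ...
```

The order matters: .data before the string, then .text before any code.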

Try this: what error do you get if you comment out the .text line? Well now you know what to do if you see that error. :) Similarly, try commenting out the .data line. Those directives are easy to forget.

A few things to notice

While you are on the Execute tab, there are three things I want you to look at:

  1. In the Text Segment panel at the top, you can see the instructions that you wrote, but you can also see the “Basic” column looks a little different.
    • If you can’t see the instructions in the Text Segment panel… make MARS wider. Lol.
    • Both la and li are pseudoinstructions: “fake” instructions that the assembler accepts and rewrites to simpler instructions that a MIPS CPU can actually understand.
    • la is rewritten to two instructions, lui and ori. It’s fine. Sometimes that happens.
  2. In the Labels panel, you should see one entry: hello_msg 0x10010000
    • That 0x10010000 is the memory address that the assembler gave to the hello_msg string.
    • Memory addresses are virtually always displayed in hexadecimal.
  3. If you check the ASCII box at the bottom of the Data Segment panel, you can see your string at the top-left!
    • You can see 0x10010000 - the address - on the left side.
    • You can also see that after the end of the string is \0 - that’s the zero terminator. (There are many more \0 bytes in memory after it, but that one was put there on purpose by the assembler.)
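Here is a rough sketch of what that la rewrite looks like, assuming hello_msg landed at 0x10010000 as shown in the Labels panel. The exact text in the “Basic” column will look a bit different (it uses register numbers), so the expansion is shown only in comments.

```mips
.data
    hello_msg: .asciiz "hello, world!\n"
.text

.global main
main:
    la a0, hello_msg
    # The assembler turns the line above into roughly:
    #   lui at, 0x1001       # at = 0x10010000 (fills in the upper 16 bits)
    #   ori a0, at, 0x0000   # a0 = at OR the lower 16 bits of the address
```

It takes two instructions because a 32-bit address can’t fit inside a single 32-bit instruction alongside everything else the instruction has to encode.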


5. String input

Now we’ll have the user type in their name and greet them. Real CS 0007 stuff. You are going to change your program so that the input/output look like this instead of just saying “hello, world!”

enter your name: Jarrett
hello, Jarrett
123
456
5
7
12
-- program is finished running (dropped off bottom) --

So you need to replace your current code that prints "hello, world!\n" with:

  1. Print "enter your name: ".
  2. Read a string from the user (explained below).
  3. Print "hello, "
    • (there’s no string concatenation so we have to print out each part of this line separately.)
  4. Print the string they typed in on step 2.

Reading a string from the user

In Java, you would use Scanner.nextLine() which returns a string, but “returning a string” is kind of complicated in a low-level language. So instead, the way this works is, you make space for a string, then you tell the syscall where that space is, and it will put what the user typed in that space.

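So to read a string from the user:

  1. In the .data segment, make space for the string: add an input_buffer: label followed by a .space directive with the number of bytes to reserve.
  2. In main, put the address of input_buffer into a0, and the maximum number of characters to read into a1.
  3. Then, do syscall 8 (look up what syscall 8 does in the MARS help).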

So to later print out what they typed in… just use syscall 4 with the address of input_buffer as the first argument.
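As a sketch of the “make space, then tell the syscall where it is” idea, here is reading a string and printing it right back out. The buffer size of 100 bytes is an assumption picked for illustration, not a required value; syscall 8 takes the buffer’s address in a0 and the maximum number of characters to read in a1.

```mips
.data
    input_buffer: .space 100   # 100 bytes of empty space (assumed size)
.text

.global main
main:
    # read a string into input_buffer
    la a0, input_buffer   # a0 = where to put the typed characters
    li a1, 100            # a1 = maximum length, matching the .space above
    li v0, 8              # syscall 8 = read string
    syscall

    # print what they typed
    la a0, input_buffer
    li v0, 4              # syscall 4 = print string
    syscall
```

When the input is shorter than the buffer, syscall 8 keeps the newline that the user typed at the end of the string, which is why whatever you print next starts on a new line.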


6. Exiting gracefully

Finally, you’ll learn how to exit your program more “properly”.

At the end of your main function (so, the end of your program), use the exit syscall (number 10). It takes no arguments, so you don’t have to put anything in the a0 register.

When you run now, your output should now look something like this (depending on what you type in for the numbers to add):

enter your name: Jarrett
hello, Jarrett
123
456
11
22
33
-- program is finished running --

So, the final message changed to no longer say “dropped off bottom.” It might not seem that important now, but it’ll save you a lot of trouble later if you end every main function with the exit syscall.
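The end of main might then look something like this sketch, written in the same style as the rest of this lab:

```mips
.global main
main:
    # ... everything else in your program goes here ...

    # exit()
    li v0, 10     # syscall 10 = exit (no arguments needed)
    syscall
```

Anything you write after this syscall inside main will never run, so it should be the very last thing.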


Submitting

Upload to Gradescope, once it’s open.

The last submission you upload is the one we grade.