Exam 2 Review Day Notes

Please read this page about taking my exams!

Exam format

When/where
- Tuesday April 22nd at 11:00 AM OR
- Tuesday April 29th at 10:00 AM (though we might not start exactly at 10:00)
- here, in this room
- 75 minutes
- it is not going to be “too long to finish”
Length
- 3 sheets of paper, double-sided
- there are A Number of Questions and I cannot tell you how many because it is not a useful thing to tell you because they are all different kinds and sizes.
  - But I will say that I tend to give many, smaller questions instead of a few huge ones.
Topic point distribution
- More credit for earlier topics
- Less credit for more recent ones
- More credit for things I expect you to know because of your experience (labs, projects)
- Only on lectures 12 through 24 inclusive
Kinds of questions
- A fair number of multiple choice and/or “pick n”
- Some fill in the blanks or matching
  - mostly for vocabulary
  - or things that I want you to be able to recognize, even if you don’t know the details
- Several short answer questions
  - again, read that page above about answering short answer questions!!
- No writing code from scratch, but:
  - tracing (reading code and saying what it does)
  - debugging (spot the mistake)
Major themes
- the program toolchain
- how programs are executed
- processes and virtual memory
- system calls, kernel mode, and drivers/kernel modules
- multiprocessing (multithreading, IPC, and synchronization)

Things people asked about in the reviews

Remember, these are just the things that people asked about. There may be topics on the exam not on this list; and there may be topics on this list that are not on the exam.

Shell scripts
- bash, the shell that you are used to using on thoth, has its own programming language
- a shell script is, in the simplest form, just a list of commands to execute in bash
```
#!/bin/bash

# build the shared library, then the program, then run the program
gcc -Wall -Werror -g -O3 -shared -fPIC -o mylibrary.so mylibrary.c
gcc -Wall -Werror -g -O3 -o myprogram myprogram.c
./myprogram mylibrary.so
```
- the filename should end in .sh
- you have to chmod +x whatever.sh
- then you can run ./whatever.sh
#define
- the preprocessor does automated copy-and-paste on the text of your source code
- you give commands to the preprocessor with the # lines (“preprocessor directives”)
  - e.g. #include <whatever.h> copies and pastes the contents of whatever.h right there
- #define lets you replace any word-like thing with any sequence of text
  - e.g. #define NUM_ITEMS 100, then everywhere you use NUM_ITEMS, the preprocessor replaces it with 100
- you can also define macros: #define MIN(a, b) (((a) < (b)) ? (a) : (b))
  - when you use MIN(x, y), then the preprocessor replaces it with (((x) < (y)) ? (x) : (y))
  - commonly used for tiny functions that don’t make sense as full functions, buuuuuuuuut these days compiler optimization is usually good enough that you don’t HAVE to do this
  - BUT macros are type-agnostic and you can do fun things with them that would be literally impossible in C
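Here is a minimal sketch of both kinds of #define described above. NUM_ITEMS and MIN come from the notes (NUM_ITEMS is 5 here instead of 100 just to keep things small); the function names smallest_int and smaller_double are made up for illustration.

```c
#include <stddef.h>

#define NUM_ITEMS 5
#define MIN(a, b) (((a) < (b)) ? (a) : (b))

/* Returns the smallest value in an array of NUM_ITEMS ints. */
int smallest_int(const int items[NUM_ITEMS]) {
    int best = items[0];
    for (size_t i = 1; i < NUM_ITEMS; i++) {
        /* the preprocessor rewrites this line to:
           best = (((best) < (items[i])) ? (best) : (items[i])); */
        best = MIN(best, items[i]);
    }
    return best;
}

/* The same macro works on doubles, because it is only text substitution. */
double smaller_double(double x, double y) {
    return MIN(x, y);
}
```

One thing to watch for: because the macro pastes its arguments in as text, an argument with a side effect (like MIN(i++, j)) can be evaluated twice.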
Function pointers
- hold the address of a function
- syntax is unmistakable: int (*p)(); declares p as a pointer to a function that returns int
- can be used for allllllllll sorts of things
  - you can make generic functions (e.g. qsort) that take a function pointer as a predicate - a function that makes a key decision in some algorithm (like sorting, filtering, finding the min or max of an array of items, that kind of thing)
  - you can do OOP - all an OOP method call really is, is a call through a function pointer that belongs to the object.
  - you can dynamically load functions out of shared objects (dlopen, dlsym) - this lets you make plugins!
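The qsort case above could look something like this sketch. qsort is the real C library function; ascending, descending, and sort_ints are hypothetical names chosen for this example.

```c
#include <stdlib.h>

/* Comparison callback for qsort: returns <0, 0, or >0. */
int ascending(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Same comparison with the arguments swapped, so the order is reversed. */
int descending(const void *a, const void *b) {
    return ascending(b, a);
}

/* Sorts n ints using whichever comparison function the caller passes in. */
void sort_ints(int *items, size_t n,
               int (*compare)(const void *, const void *)) {
    qsort(items, n, sizeof items[0], compare);
}
```

Calling sort_ints(v, n, ascending) and sort_ints(v, n, descending) runs the exact same sorting code; only the function pointer changes the result.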
Linking and loading (static, dynamic linking, dynamic loading)
- There are three times when linking can happen:
  - Static linking occurs after compilation, and it combines the object files and any statically-linked libraries into an executable file or shared object file.
  - Dynamic linking happens while an executable is being loaded into memory to be run. The loader sees which dynamically-linked libraries (i.e. .so files) the executable depends on, and loads them into memory along with the contents of the executable.
  - Dynamic loading happens while an executable is ALREADY running. In this case, the process asks the OS to dynamically load and link libraries on its behalf, using the dlopen and dlsym functions.
- Static linking pros and cons:
  - a fully statically-linked program is a “complete puzzle” and has no dependencies, which makes it easy to install and run
    - no “missing shared libraries” problems - if the program can be loaded, it will run
  - once a statically linked program is created, it will always work the same way forever
    - (assuming that the system calls that it makes keep working the same way)
  - static linking usually results in much larger executables
    - which can be a problem if, for example, every executable statically links the same large library - like the C runtime library, which would use up a lot of hard drive and memory space
  - statically linking in a buggy library means that those bugs in the library are embedded inside the executable
    - for example if there is a bug in version 6 of libc, and your program statically links it in, it doesn’t matter if libc version 7 comes out, your executable is stuck with version 6
- Dynamic linking pros and cons:
  - dynamically linked executables can be much smaller
    - not just on the hard drive either - multiple processes can use the same shared library code at runtime, saving memory
  - when a bug is fixed in a dynamically linked library, the next time you run the executable, the fixed version is linked in
    - this is particularly important for bugs that expose programs to security vulnerabilities through no fault of their own!
  - …but sometimes, fixing bugs and changing other behaviors of a dynamically linked library can break the programs that depend on them
    - not every program is perfect, and sometimes they rely on incorrect or unspecified behavior of a library
    - newer library versions can break what was a perfectly-executing program before
    - this is why OSes often keep multiple versions of libraries (e.g. libc.so.4, libc.so.5, libc.so.6)
  - and if a dynamically linked library cannot be found, the program cannot be run
    - it’s no longer as simple as copying the executable to another computer
    - you may also need to install those shared libraries
Symbol tables
- each object and executable and library file has a symbol table which is a list of things in that file, including:
  - what their names are
  - what they are (function, variable, etc.)
  - where they are (what address within the file)
  - where they want to be (what address in memory)
  - how big they are (how many bytes)
fork()
- all processes in POSIX are created by being cloned from an existing process
- fork() makes a perfect copy of the current process
  - the current process is called the parent
  - the newly-created process is called the child
  - (note that there is no “hierarchy” between these processes, the parent doesn’t have “special privileges” over the child or anything. it’s just a naming thing)
- fork() returns an integer, and this is the only place where the parent and child processes diverge:
  - in the parent process, fork() returns the child process’s PID
  - in the child process, fork() returns 0
  - (if fork() fails, no child is created and it returns -1 in the parent)
- from there, the typical next step is for the child process to use one of the exec*() family of functions to transform itself into a new program
  - but this isn’t required - e.g. in a multi-process program (like a web browser), the child process might just work with the parent process without changing code
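A small sketch of the parent/child split described above. fork, waitpid, and _exit are real POSIX calls (waitpid is not covered in these notes, it is just how the parent collects the result here); fork_and_wait is a made-up helper name.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that exits with `code`. The parent waits for it and
 * returns the child's exit status, or -1 if anything fails. */
int fork_and_wait(int code) {
    pid_t pid = fork();
    if (pid < 0) {
        return -1;              /* fork failed: no child was created */
    }
    if (pid == 0) {
        /* Child: fork() returned 0. A real program might call one of
         * the exec*() functions here instead of exiting right away. */
        _exit(code);
    }
    /* Parent: fork() returned the child's PID. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes run the same code after fork() returns; the value of pid is the only thing that tells them apart.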
The POSIX filesystem and files
- the POSIX filesystem is really just a big hierarchical namespace.
- / is the root
  - inside / are directories like /usr/, /bin/, /home/, etc.
  - inside those can be more directories, and files, etc. ad infinitum
  - there are also symlinks (symbolic links) that are like pointers pointing from one location in the file tree to another location
- each directory and file COULD be:
  - a real-ass directory or file stored in the persistent storage (hard drive); or
  - fake things provided by the kernel or one of its modules to allow userspace programs to see and interact with the kernel through the file concept
    - e.g. /dev/, /sys/, /proc/
- the File concept
  - everything is a file!!!
  - that means that you can use the open(), close(), read(), and write() system calls on anything in the filesystem
  - essentially the concept is: everything is a stream of bytes.
  - when you open(), you are opening a communication channel between your program and that thing.
  - whenever you write(), you are sending bytes to something.
  - whenever you read(), you are receiving bytes from something.
  - and close() closes the communication channel.
- this is nice because it means we don’t have to add eighty billion syscalls to interact with every possible kernel module!
- whenever you call open(), the OS opens that file and adds it to your process’s file descriptor table and gives you the index to that file in that table
  - that’s why the file descriptors are just integers 0, 1, 2, 3, …
  - each process has a limited number of files that it can have open at once
- then when you call close() it removes that entry from your file descriptor table.
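To make the "everything is a stream of bytes" idea concrete, here is a sketch that uses the same three calls on any path. read_bytes is a hypothetical helper, and the /dev/zero path mentioned afterwards is an assumption about the system (it exists on Linux and most Unix-likes).

```c
#include <fcntl.h>
#include <unistd.h>

/* Opens `path`, reads up to n bytes into buf, and closes it again.
 * Returns the number of bytes read, or -1 on error. */
ssize_t read_bytes(const char *path, unsigned char *buf, size_t n) {
    /* fd is an index into this process's file descriptor table */
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }
    /* receive bytes from whatever is behind the descriptor */
    ssize_t got = read(fd, buf, n);
    /* remove the entry from the file descriptor table */
    close(fd);
    return got;
}
```

The same function works whether the path is a regular file on disk or something the kernel provides, like /dev/zero, which just hands back zero bytes forever.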
The kernel - how it works, what it’s responsible for, what it DOES, kernel modules
- the OS consists of a whole bunch of things, and the kernel is - like the name implies - the kind of “center” of it.
- the kernel is responsible for scheduling all processes running on the system.
- the kernel also keeps track of which resources on the system are being used by which processes - that’s the file descriptor table per-process, but also a bunch of other things (virtual memory map per-process, thread descriptions per-thread, etc.)
- scheduling decides who runs next, and for how long
  - modern OSes mostly schedule threads, not processes, because most modern OSes use kernel threading, where the concept of a thread is a kernel idea
- the kernel is allowed to do just about anything
  - run any instruction in the full instruction set that the CPU has
  - access any register in the CPU
  - access any piece of hardware in the computer
  - access any part of memory belonging to any process
  - access physical memory
  - respond to interrupts (asynchronous notifications from pieces of hardware that something happened)
  - etc.
- however, the kernel itself is kind of a weird programming environment
  - so if you write kernel modules, you won’t have access to the things you may be used to, like libc
  - because it’s not a user process, so you can’t do syscalls.
  - so if you want to do things like access files, you can’t ask yourself to do it. you have to go to whatever kernel module is responsible for that and ask it directly
- the kernel itself is actually very small and has very few capabilities beyond scheduling and memory management
- SO the kernel can load special kinds of shared objects into itself - .ko files, which are kernel modules
  - these can extend the capabilities of the kernel by adding more code for doing just about anything you can think of
  - a device driver is a kind of kernel module that is responsible for controlling some piece of hardware
  - other kernel modules can do things like expose special CPU capabilities to user mode that user mode programs would normally not be allowed to do
    - often CPUs have sets of instructions and registers that are only accessible in kernel mode, and so these modules can expose those features to user mode in a limited way that checks that the user process isn’t using them to do anything bad
  - other kernel modules can implement things like networking, IPC, file systems (that is, the data structures that actually store files on the hard drive), etc.
  - every kernel module can expose these capabilities in /dev/, /sys/, /proc/ as “files” which user processes can access to communicate with them
  - but kernel modules execute in kernel mode, which can make them dangerous if you don’t know what you’re doing (or if it’s actually malware in disguise)!
User mode drivers
- Only kernel mode can access hardware.
- Kernel-mode drivers are developed as kernel modules so they can access the hardware. But this has problems:
  - Kernel modules must be written in the language that the kernel supports (usually C)
  - The kernel is a strange programming environment and requires a lot of care to do things right
  - Doing things wrong in kernel mode will crash the entire system
  - Debugging kernel modules is possible, but tricky
  - Distributing kernel modules requires blessings from OS manufacturers (MS, Apple)
- User-mode drivers work like this:
  - There is a kernel module which acts as an arbiter and allows hardware access indirectly.
  - The user-mode driver is a daemon/service - a process that runs in the background - and talks to that kernel module by being given some special permissions to do so
  - When some user program wants to talk to a piece of hardware, it:
    - Makes a syscall to the kernel module
    - The kernel module forwards that request to the daemon
    - The daemon handles the request, and has the kernel module access the hardware on its behalf
    - The kernel module returns the requested data to the user program
  - This way you can write the driver in any language you like, in a normal programming environment where you can access files, networking, etc. User-mode drivers are also easy to debug, crashing doesn’t break anything else, and they’re easy to distribute
  - However, they require a lot of context switching which slows them down, and they cannot directly respond to hardware interrupts - there may be a significant delay between the interrupt coming in and the daemon processing it, which may be unacceptable for some kinds of hardware.
Virtual Memory
- Physical memory (the actual memory installed in the computer) is limited, and different processes need different amounts of memory.
  - Also, the kernel is responsible for isolating processes from one another.
- Virtual memory is the solution to these and many other problems. It abstracts away the physical addresses from the processes, and processes only ever see virtual addresses.
  - Each process has its own virtual address space, an imaginary 2ⁿ bytes of memory (addresses 0 through 2ⁿ-1, where n is the number of bits in an address).
  - However, the kernel manages a page table for each process.
  - On every single load, store, and instruction fetch, the virtual address is translated into the physical address in memory using the page table.
  - This is done by the CPU hardware.
- For example, a process may believe it is accessing address 0x80004010, but that virtual address gets translated to a totally different physical address, like 0xC2853010.
  - This is totally invisible to the process.
- This allows the kernel to:
  - manage the physical memory much more easily (e.g. contiguous virtual memory is not necessarily contiguous in physical memory!)
  - only allow each process to access its own memory - processes are not even capable of seeing other processes’ virtual address spaces
  - kill any process that tries to access memory that doesn’t exist in its virtual address space - that’s what a segfault is
  - do Fun Shenanigans with the underlying physical memory, like copying it out to the hard drive (that’s 1550 tho)
  - set up shared memory - 2 processes, 2 virtual pages which map to same physical page - now they can communicate without kernel involvement
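The 0x80004010 to 0xC2853010 example above can be worked through in code. This is only a toy sketch: it assumes 4 KiB pages (12 offset bits) and a flat array searched in software, whereas real hardware walks the page table itself. The names PAGE_BITS, struct mapping, and translate are all made up for this example.

```c
#include <stdint.h>

#define PAGE_BITS 12u                        /* assumed: 4 KiB pages */
#define PAGE_MASK ((1u << PAGE_BITS) - 1u)

/* One toy page-table entry: virtual page number -> physical frame number. */
struct mapping {
    uint32_t vpn;
    uint32_t pfn;
};

/* Translates vaddr using a tiny table. Returns 0 and sets *paddr on
 * success, or -1 if the page is not mapped (this is the case where a
 * real kernel would deliver a segfault). */
int translate(const struct mapping *table, int entries,
              uint32_t vaddr, uint32_t *paddr) {
    uint32_t vpn = vaddr >> PAGE_BITS;       /* which virtual page */
    uint32_t offset = vaddr & PAGE_MASK;     /* position inside that page */
    for (int i = 0; i < entries; i++) {
        if (table[i].vpn == vpn) {
            *paddr = (table[i].pfn << PAGE_BITS) | offset;
            return 0;
        }
    }
    return -1;
}
```

With a single entry mapping virtual page 0x80004 to physical frame 0xC2853, address 0x80004010 splits into page 0x80004 and offset 0x010, and comes out as 0xC2853010. Only the page number changes; the offset is carried over untouched.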
Signals
- An asynchronous way for the OS to notify programs of certain special events
- The program can register a “signal handler” - a special function that is called when the signal comes in
- When the signal happens, normal execution is paused and the signal handler is called; once it finishes, normal execution resumes
- Some signals are for harmless things like SIGINT (user pressed Ctrl+C) or SIGTERM (user asked the program to exit nicely, like File > Quit)
- Some are for more serious things like SIGSEGV (segfault) or SIGKILL (instantly kills process, cannot be handled like other signals)
- There are some standard POSIX signals and OSes also often define their own custom signals for system-specific events
x86/x64, the System V ABI
- no. not on the exam
handling blocking system calls in user threading (shims)
- no. not on the exam
Race Conditions
- a situation where the outcome depends on an unpredictable ordering of events
- these come up often in multiprocessing (multithreading, or multiple processes, or even in real life - any time you have multiple “actors” sharing some resource)
- e.g.
  - you have two threads both doing their own thing
  - those two threads try to access the same variable at the same time
  - if they don’t synchronize their accesses, then the ordering of the steps of the two threads can become interleaved in such a way that the value of the variable is set to some strange value.
  - T1: for(i = 0 to 100000) var++
  - T2: for(i = 0 to 100000) var--
  - you would expect var to be 0 at the end of the program, BUUUUUUUUT each increment is multiple steps and the steps of the two threads can interleave in weird ways, e.g.:
    - var == 5 at the beginning, then:
    - T1: lw t0, var # T1's t0 == 5
    - T2: lw t0, var # T2's t0 == 5
    - T1: add t0, t0, 1 # T1's t0 == 6
    - T1: sw t0, var # var == 6
    - T2: sub t0, t0, 1 # T2's t0 == 4
    - T2: sw t0, var # var == 4
    - and now the value is wrong. we did “one increment” and “one decrement” but var went from 5 to 4.
  - importantly, you don’t control the order in which those steps are run because you aren’t in control of the thread scheduling. so sometimes, the steps are run in the “right” order, and sometimes they aren’t.
- synchronization lets us force steps to be executed atomically - without being interrupted.
  - the simplest way to do this is with a mutex - each thread locks the mutex before doing its steps, and unlocks it after.
  - the code between the lock and unlock is called the critical section.
  - if another thread tries to lock an already-locked mutex, it is blocked and has to wait until the first thread unlocks it.
  - this has the effect of making the sequence of steps in the critical section atomic - they will be executed in order as a group, and no other thread can “squeeze in” to execute conflicting steps at the same time.
  - note that other threads can still run while the mutex is locked just fine. what they can’t do is both run the same critical section at the same time.
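Here is the T1/T2 example from above with a mutex added, as a sketch using POSIX threads (compile with -pthread). The pthread_* calls are real; ITERATIONS, var_lock, worker, and run_counter_demo are names invented for this example.

```c
#include <pthread.h>

#define ITERATIONS 100000

static long var = 0;
static pthread_mutex_t var_lock = PTHREAD_MUTEX_INITIALIZER;

/* Adds *(int *)arg (+1 or -1) to var ITERATIONS times,
 * locking the mutex around each update. */
static void *worker(void *arg) {
    int delta = *(int *)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&var_lock);     /* enter the critical section */
        var += delta;                      /* load, add, store as one group */
        pthread_mutex_unlock(&var_lock);   /* leave the critical section */
    }
    return NULL;
}

/* Runs T1 (increment) and T2 (decrement) at the same time and returns
 * the final value of var, or -1 if a thread could not be created. */
long run_counter_demo(void) {
    pthread_t t1, t2;
    int up = 1, down = -1;
    var = 0;
    if (pthread_create(&t1, NULL, worker, &up) != 0) {
        return -1;
    }
    if (pthread_create(&t2, NULL, worker, &down) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return var;
}
```

With the lock and unlock calls in place the result is always 0. If you delete them, the load/add/store steps can interleave like in the trace above and the final value becomes unpredictable.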
Semaphores
- a mutex only has 2 states: unlocked and locked
- a semaphore is another synchronization primitive that is a generalization of a mutex to n states, for any n ≥ 2
- e.g. let’s say it starts off at 4
  - locking it decrements it to 3…
  - but it’s not “fully locked” yet. a second thread can lock it at the same time, decrementing it to 2
  - it only becomes “fully locked” when it’s decremented to 0
- so a semaphore that starts off at 4 can be simultaneously locked by up to 4 threads
  - and a 5th thread attempting to lock it will sleep until it’s unlocked, just like a mutex
- semaphores might be a better fit for some kinds of problems
- but they can also be reimplemented in terms of mutexes + cond vars
  - all the synchronization primitives are pretty much isomorphic - they can be implemented in terms of each other
  - but there might be different performance characteristics
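The "starts off at 4" example can be checked with POSIX unnamed semaphores. This sketch uses sem_trywait, which fails instead of sleeping when the count is 0, so it can run in a single thread. It assumes a Linux-like system (unnamed semaphores are not supported everywhere, e.g. macOS), and count_successful_locks is a made-up name.

```c
#include <errno.h>
#include <semaphore.h>

/* Initializes a semaphore to `initial`, then tries to lock it `attempts`
 * times without blocking. Returns how many locks succeeded, or -1 on error. */
int count_successful_locks(unsigned initial, int attempts) {
    sem_t sem;
    if (sem_init(&sem, 0, initial) != 0) {
        return -1;
    }
    int locked = 0;
    for (int i = 0; i < attempts; i++) {
        if (sem_trywait(&sem) == 0) {
            locked++;                  /* decremented the count by one */
        } else if (errno != EAGAIN) {
            sem_destroy(&sem);
            return -1;                 /* unexpected failure */
        }
        /* EAGAIN means the count is 0: sem_wait() would have slept here. */
    }
    sem_destroy(&sem);
    return locked;
}
```

Starting at 4 and trying 6 times, the first 4 attempts succeed and the last 2 do not, which matches the "5th thread has to wait" behavior described above.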
