Please read this page about taking my exams!
Exam format
- When/where
- Morning (11AM) section: Monday April 27 at 12:00 PM in the normal classroom.
- Afternoon (2:30PM) section: Tuesday April 28 at 2:00 PM in the normal classroom.
- 75 minutes
- it is not going to be “too long to finish”
- Length
- 3 sheets of paper, double-sided
- there are A Number of Questions and I cannot tell you how many because it is not a useful thing to tell you because they are all different kinds and sizes.
- But I will say that I tend to give many, smaller questions instead of a few huge ones.
- Topic point distribution
- More credit for earlier topics
- Less credit for more recent ones
- More credit for things I expect you to know because of your experience (labs, projects)
- Only on lectures 10 through 23 inclusive
- Kinds of questions
- A fair number of multiple choice and/or “pick n”
- Some fill in the blanks or matching
- mostly for vocabulary
- or things that I want you to be able to recognize, even if you don’t know the details
- Several short answer questions
- again, read that page above about answering short answer questions!!
- No writing code from scratch, but:
- tracing (reading code and saying what it does)
- debugging (spot the mistake)
- Major themes
- the program toolchain
- how programs are executed
- processes and virtual memory
- system calls, kernel mode, and drivers/kernel modules
- multiprocessing (multithreading, IPC, and synchronization)
Things people asked about in the reviews
Remember, these are just the things that people asked about. There may be topics on the exam not on this list; and there may be topics on this list that are not on the exam.
- #define
- the preprocessor does automated copy-and-paste on the text of your source code
- you give commands to the preprocessor with the # lines (“preprocessor directives”)
- e.g. #include <whatever.h> copies and pastes the contents of whatever.h right there
- e.g. #define lets you replace any word-like thing with any sequence of text
- e.g. #define NUM_ITEMS 100, then everywhere you use NUM_ITEMS, the preprocessor replaces it with 100
- you can also define macros: #define MIN(a, b) (((a) < (b)) ? (a) : (b))
- when you use MIN(x, y), then the preprocessor replaces it with (((x) < (y)) ? (x) : (y))
- commonly used for tiny functions that don’t make sense as full functions, buuuuuuuuut these days compiler optimization is usually good enough that you don’t HAVE to do this
- BUT macros are type-agnostic and you can do fun things with them that would be literally impossible in C
- Function pointers
- hold the address of a function
- syntax is unmistakable: int (*p)();
- can be used for allllllllll sorts of things
- you can make generic functions (e.g. qsort) that take a function pointer as a predicate - a function that makes a key decision in some algorithm (like sorting, filtering, finding the min or max of an array of items, that kind of thing)
- you can do OOP - all an OOP method call really is, is calling a function pointer that is stored inside the object.
- you can dynamically load functions out of shared objects (dlopen, dlsym) - this lets you make plugins!
- Compilation toolchain
- sequence of steps that goes from source code to running process
- Preprocessing (automated copy-and-paste)
- Compiling (source code => machine code)
- “Assembling” is kinda in there too but we didn’t really talk about it
- Linking (object files + libraries => executable file)
- Loading (unpacking executable file into memory, starting process)
- Linking and loading (static, dynamic linking, dynamic loading)
- Linking is how you put pieces of a program together
- There are three times when linking can happen:
- Static linking occurs after compilation, and it combines the object files and any statically-linked libraries into an executable file or shared object file. The contents of the object files/libraries are copied into the executable file.
- Dynamic linking happens while an executable is being loaded into memory to be run. The loader sees which dynamically-linked libraries (i.e. .so files) the executable depends on, and loads them into memory along with the contents of the executable.
- Dynamic loading happens while an executable is ALREADY running. In this case, the process asks the OS to dynamically load and link libraries on its behalf, using the dlopen and dlsym functions.
- useful for PLUGINS - loading optional functionality, or for extending programs to do more than they were originally programmed to do
- because it’s under programmer control, if loading fails, it can fall back onto other code, or give an error, or something else - unlike with dynamic linking, which is under OS control, and if it fails, the whole program just never runs.
- Static linking pros and cons:
- a fully statically-linked program is a “complete puzzle” and has no dependencies, which makes it easy to install and run
- no “missing shared libraries” problems - if the program can be loaded, it will run
- once a statically linked program is created, it will always work the same way forever
- (assuming that the system calls that it makes keep working the same way)
- Updates to other parts of the system (e.g. the standard C library) will not cause a statically linked executable to stop functioning properly, because it has the specific version it depends upon embedded within itself
- static linking usually results in much larger executables
- which can be a problem if, for example, every executable statically links the same large library - like the C runtime library, which would use up a lot of hard drive and memory space
- statically linking in a buggy library means that those bugs in the library are embedded inside the executable
- for example if there is a bug in version 6 of libc, and your program statically links it in, it doesn’t matter if libc version 7 comes out, your executable is stuck with version 6
- only way to fix that bug is to recompile, relink, and redistribute the program to everyone who has it installed
- Dynamic linking pros and cons:
- dynamically linked executables can be much smaller
- not just on the hard drive either - multiple processes can use the same shared library code at runtime, saving memory (thanks, virtual memory! can map same physical pages into multiple processes’ vmem map)
- when a bug is fixed in a dynamically linked library, the next time you run the executable, the fixed version is linked in
- this is particularly important for bugs that expose programs to security vulnerabilities through no fault of their own!
- …but sometimes, fixing bugs and changing other behaviors of a dynamically linked library can break the programs that depend on them
- not every program is perfect, and sometimes they rely on incorrect or unspecified behavior of a library
- newer library versions can break what was a perfectly-executing program before
- this is why OSes often keep multiple versions of libraries (e.g. libc.so.4, libc.so.5, libc.so.6)
- and if a dynamically linked library cannot be found, the program cannot be run
- it’s no longer as simple as copying the executable to another computer
- you may also need to install those shared libraries
- Symbol tables
- each object and executable and library file has a symbol table which is a list of things in that file, including:
- what their names are
- what they are (function, variable, etc.)
- where they are (what address within the file)
- where they want to be (what address in memory)
- how big they are (how many bytes)
- fork() and exec*()
- all processes in POSIX are created by being cloned from an existing process
- fork() makes a perfect copy of the current process
- the current process is called the parent
- the newly-created process is called the child
- (note that there is no “hierarchy” between these processes, the parent doesn’t have “special privileges” over the child or anything. it’s just a naming thing)
- fork() returns an integer, and this is the only place where the parent and child processes diverge:
- in the parent process, fork() returns the child process’s PID
- in the child process, fork() returns 0
- from there, the typical next step is for the child process to use one of the exec*() family of functions to transform itself into a new program
- but this isn’t required - e.g. in a multi-process program (like a web browser), the child process might just work with the parent process through IPC without changing code
- The POSIX filesystem and files
- the POSIX filesystem is really just a big hierarchical namespace.
- / is the root
- inside / are directories like /usr/, /bin/, /home/, etc.
- inside those can be more directories, and files, etc. ad infinitum
- there are also symlinks (symbolic links) that are like pointers pointing from one location in the file tree to another location
- each directory and file COULD be:
- a real-ass directory or file stored in the persistent storage (hard drive); or
- fake things provided by the kernel or one of its modules to allow userspace programs to see and interact with the kernel through the file concept
- e.g. /dev/, /sys/, /proc/
- the File concept
- everything is a file!!!
- that means that you can use the open(), close(), read(), and write() system calls on anything in the filesystem
- essentially the concept is: everything is a stream of bytes.
- when you open(), you are opening a communication channel between your program and that thing.
- whenever you write(), you are sending bytes to something.
- whenever you read(), you are receiving bytes from something.
- and close() closes the communication channel.
- this is nice because it means we don’t have to add eighty billion syscalls to interact with every possible kernel module!
- whenever you call open(), the OS opens that file and adds it to your process’s file descriptor table and gives you the index to that file in that table
- that’s why the file descriptors are just integers 0, 1, 2, 3, …
- each process has a limited number of files that it can have open at once
- then when you call close() it removes that entry from your file descriptor table.
- User mode vs. kernel mode
- CPU has 2 modes it can run in (technically it could be more but 2 is the most common and the minimum needed)
- user mode is more restricted - uses virtual addresses, can’t access all instructions (privileged instructions will crash program), can’t access hardware, can’t access any other process’s memory (locked in VMem jail)
- kernel mode is unrestricted - uses physical addresses, CAN access all instructions, CAN access hardware, CAN access all memory belonging to all processes
- Syscalls
- The one and only way for user processes to ask the OS to do something on their behalf
- e.g. I/O, modifying virtual memory map, exiting
- a syscall causes a context switch from user mode to kernel mode
- and starts executing at the kernel’s syscall handler, always
- there are a fixed number of syscalls to do a limited set of things
- therefore it’s advantageous to implement some things on top of existing abstractions
- e.g. using the file abstraction to open communication channels to kernel modules or to other processes (open/read/write/close syscalls)
- The kernel - how it works, what it’s responsible for, what it DOES, kernel modules
- the OS consists of a whole bunch of things, and the kernel is - like the name implies - the kind of “center” of it.
- the kernel is responsible for controlling access to the computer’s resources, such as:
- time on the CPU (through scheduling)
- memory (through virtual memory)
- files (through the file system)
- networking connections, screen, keyboard, etc. etc. etc. (through device drivers)
- the kernel also abstracts the hardware away from the user processes
- so e.g. same program will work on both Intel and AMD, or both x86 and ARM, or both 32- and 64-bit, or systems with 4GB memory and systems with 128GB of memory, or or or…
- or preventing user processes from accessing hardware that they shouldn’t
- or to present virtual devices that don’t really exist
- the kernel keeps track of which resources on the system are being used by which processes - that’s the file descriptor table per-process, but also a bunch of other things (virtual memory map per-process, thread descriptions per-thread, etc.)
- scheduling decides who runs next, and for how long
- modern OSes mostly schedule threads, not processes, because most modern OSes use kernel threading, where the concept of a thread is a kernel idea
- the kernel is allowed to do just about anything
- run any instruction in the full instruction set that the CPU has
- access any register in the CPU
- access any piece of hardware in the computer
- access any part of memory belonging to any process
- access physical memory
- respond to interrupts (asynchronous notifications from pieces of hardware that something happened)
- etc.
- however, the kernel itself is kind of a weird programming environment
- so if you write kernel modules, you won’t have access to the things you may be used to, like libc
- because it’s not a user process, so you can’t do syscalls.
- so if you want to do things like access files, you can’t ask yourself to do it. you have to go to whatever kernel module is responsible for that and ask it directly
- the kernel itself is actually very small and has very few capabilities beyond scheduling and memory management
- SO the kernel can load special kinds of shared objects into itself - .ko files, which are kernel modules
- these can extend the capabilities of the kernel by adding more code for doing just about anything you can think of
- a device driver is a kind of kernel module that is responsible for controlling some piece of hardware
- other kernel modules can do things like expose special CPU capabilities to user mode that user mode programs would normally not be allowed to use
- often CPUs have sets of instructions and registers that are only accessible in kernel mode, and so these modules can expose those features to user mode in a limited way that checks that the user process isn’t using them to do anything bad
- other kernel modules can implement things like networking, IPC, file systems (that is, the data structures that actually store files on the hard drive), etc.
- every kernel module can expose these capabilities in /dev/, /sys/, /proc/ as “files” which user processes can access to communicate with them
- but kernel modules execute in kernel mode, which can make them dangerous if you don’t know what you’re doing (or if it’s actually malware in disguise)!
- Block vs. character devices
- Character devices are a one-directional stream of bytes
- no concept of “position” or “address”
- you just write bytes (characters) into it one at a time, or read them out one at a time - they are unidirectional
- think of a tube, pipe, wire, whatever
- “data that comes in/goes out over time”
- do NOT think “character = thing you write in notebook” - common misconception in previous terms
- it is “character” in the sense of a C char - a byte-size integer type.
- Block devices have a concept of a “position” or “address”
- it’s a big ol block of bytes
- think of a notebook with numbered pages
- “data that exists over a span of space”
- you can go to a position and store or load data from that position
- User mode drivers
- Only kernel mode can access hardware.
- Kernel-mode drivers are developed as kernel modules so they can access the hardware. They are dynamically loaded into the kernel as needed. Pros and cons:
- Pro: fast. When user process needs to access hardware, just 2 context switches needed - one to get into the driver, and another to come back out.
- Pro: can handle hardware interrupts, which are needed for some hardware in order to support very low-latency operations (e.g. time-critical networking operations, high-speed data transfers)
- Con: Kernel modules must be written in the language that the kernel supports (usually C)
- Con: The kernel is a strange programming environment - limited memory, no standard library, totally different ways to do things like access files or networking than you would use in user mode
- Con: Doing things wrong in kernel mode will crash the entire system
- Con: Debugging kernel modules is possible, but much more difficult than simply firing up gdb
- Con: Distributing kernel modules requires blessings from OS manufacturers (MS, Apple), and they must be installed by a system administrator (computer owner)
- User-mode drivers work like this:
- There is a kernel module (the “driver framework”, e.g. libusb, DriverKit) which acts as an arbiter and allows hardware access indirectly. It is there to talk to the hardware and to shuttle data between the user-mode driver and other processes.
- The user-mode driver is a daemon/service - a process that runs in the background - and talks to that kernel module by being given some special permissions to do so (e.g. accessing certain special virtual files in /dev, /sys, /proc which are connected to that kernel module)
- When some user program wants to talk to a piece of hardware, it:
- Makes a syscall to the kernel module
- The kernel module forwards that request to the daemon
- The daemon handles the request, and has the kernel module access the hardware on its behalf
- The kernel module returns the requested data to the user program
- Pros and cons of user-mode drivers:
- Pro: easier to write (can use any programming language; can use any libraries and OS features that you need and you use them like you’re writing any other user-mode app; can debug them with all the debugging tools you’re used to)
- Pro: much easier to distribute (don’t need permission from OS writers; possibly don’t even need administrative privileges to install the driver, as long as the framework is already installed)
- theoretically such a driver could even be cross-platform - e.g. write it in Python to use libusb, and it’ll work on any computer that has Python and libusb installed!
- Pro: much less likely to cause problems (user-mode process can only do so much harm - only has access to limited things; a crashing user-mode process is no big deal, just restart it)
- Con: slower (requires many context switches to do their work - hopping between requesting process, framework, daemon, framework, daemon, framework, requesting process etc. just to do a single request)
- Con: cannot use hardware interrupts (or at least, cannot respond to them with low latency - since they’re user-mode processes, they will only run when the kernel decides to get around to scheduling them)
- Virtual Memory
- Physical memory (the actual memory installed in the computer) is limited, and different processes need different amounts of memory.
- Also, the kernel is responsible for isolating processes from one another.
- Virtual memory is the solution to these and many other problems. It abstracts away the physical addresses from the processes, and processes only ever see virtual addresses.
- Each process has its own virtual address space, an imaginary 2^n bytes of memory.
- However, the kernel manages a page table for each process.
- On every single load, store, and instruction fetch, the virtual address is translated into the physical address in memory using the page table.
- This is done by the CPU hardware.
- This introduces a layer of indirection into the memory addresses accessed by processes, and we all know that every problem in CS can be solved with another layer of indirection!
- For example, a process may believe it is accessing address 0x80004010, but that virtual address gets translated to a totally different physical address, like 0xC2853010.
- This is totally invisible to the process.
- This allows the kernel to:
- manage the physical memory much more easily (e.g. contiguous virtual memory is not necessarily contiguous in physical memory!)
- only allow each process to access its own memory - processes are not even capable of seeing other processes’ virtual address spaces
- kill any process that tries to access memory that doesn’t exist in its virtual address space - that’s what a segfault is
- do Fun Shenanigans with the underlying physical memory, like copying it out to the hard drive (that’s 1550 tho)
- set up shared memory - 2 processes, 2 virtual pages which map to same physical page - now they can communicate without kernel involvement
- Signals
- An asynchronous way for the OS to notify programs of certain special events
- The program can register a “signal handler” - a special function that is called when the signal comes in
- When the signal happens, normal execution is paused and the signal handler is called; once it finishes, normal execution resumes
- Some signals are for harmless things like SIGINT (user pressed Ctrl+C) or SIGTERM (user asked the program to exit nicely, like File > Quit)
- Some are for more serious things like SIGSEGV (segfault) or SIGKILL (instantly kills process, cannot be handled like other signals)
- There are some standard POSIX signals and OSes also often define their own custom signals for system-specific events
- x86/x64, the System V ABI
- no. not on the exam
- Thread scheduling (user/kernel)
- scheduling is deciding who gets the CPU next, and for how long
- process scheduling is done by the kernel, virtually always preemptively
- thread scheduling can be done by either the kernel or the user process itself!
- User-mode thread scheduling (“user threading”) means the threads are switched by the user process itself
- it must be done cooperatively, because there is no way to do preemptive scheduling without a hardware timer
- i.e. each thread has to voluntarily give up control of the CPU to other threads
- but it’s usually fine, because the software author is the one who wrote all the threads - don’t have to worry about a “rogue malware thread” (usually… unless it’s running a language interpreter lmao)
- Kernel-mode thread scheduling (“kernel threading”) means the kernel is aware of the threads, and you use syscalls to create threads, and the kernel does the scheduling for you, preemptively
- downside is more context switches
- …UNLESS you have hardware acceleration, in which case no context switches are needed after the kernel sets it up, and it’s done in hardware meaning it’s even faster than user threading
- KERNEL THREADING DOES NOT MEAN THE THREADS RUN IN KERNEL MODE. IT MEANS THE KERNEL SCHEDULES THE THREADS. THERE ARE NO SECURITY CONSIDERATIONS FOR KERNEL THREADING. IT IS NOT DANGEROUS. THREADS ARE PART OF PROCESSES AND PROCESSES ALWAYS RUN IN USER MODE. AAAAAAAAAAAA!!!!!!!!!!
- Race Conditions
- a situation where the outcome depends on an unpredictable ordering of events
- these come up often in multiprocessing (multithreading, or multiple processes, or even in real life - any time you have multiple “actors” sharing some resource)
- e.g.
- you have two threads both doing their own thing
- those two threads try to access the same variable at the same time
- if they don’t synchronize their accesses, then the ordering of the steps of the two threads can become interleaved in such a way that the value of the variable is set to some strange value.
- T1: for(i = 0 to 100000) var++
- T2: for(i = 0 to 100000) var--
- you would expect var to be 0 at the end of the program, BUUUUUUUUT each increment is multiple steps and the steps of the two threads can interleave in weird ways, e.g.: var == 5 at the beginning, then:
T1: lw  t0, var     # T1's t0 == 5
T2: lw  t0, var     # T2's t0 == 5
T1: add t0, t0, 1   # T1's t0 == 6
T1: sw  t0, var     # var == 6
T2: sub t0, t0, 1   # T2's t0 == 4
T2: sw  t0, var     # var == 4
- and now the value is wrong. we did “one increment” and “one decrement” but var went from 5 to 4.
- importantly, you don’t control the order in which those steps are run because you aren’t in control of the thread scheduling. so sometimes, the steps are run in the “right” order, and sometimes they aren’t.
- Synchronization
- lets us force steps to be executed atomically - without being interrupted.
- the simplest way to do this is with a mutex - each thread locks the mutex before doing its steps, and unlocks it after.
- “mutex” stands for MUTual EXclusion (“one or the other, but not both”, or “at most one at a time”)
- IRL this is enforced in traffic by using stop signs/lights and drivers who know the rules of the road
- which means it’s an honor system and anyone who breaks the rules can cause problems
- which is also true in programming - any thread/process which doesn’t properly synchronize access to shared state using a mutex can cause race conditions, inconsistent state, etc.
- kind of a special boolean with two states:
- unlocked: one thread can lock it, and it now “owns” the mutex
- locked: a second (or third, or fourth…) thread tries to lock it? they are blocked
- the code between the lock and unlock is called the critical section.
- if another thread tries to lock an already-locked mutex, it is blocked and has to wait until the first thread unlocks it.
- this has the effect of making the sequence of steps in the critical section atomic - they will be executed in order as a group, and no other thread can “squeeze in” to execute conflicting steps at the same time.
- note that other threads can still run while the mutex is locked just fine. what they can’t do is both run the same critical section at the same time.
- Atomic operations
- atomic operations cannot be interrupted and must complete before anything else is allowed to use that resource (whatever that resource happens to be)
- IRL - intersection of 2 roads. atomic operation is driving through that shared intersection. enter it, drive through it, leave it. if a second car tries to do the same thing at the same time, 💥
- in code, we have sorta 2 kinds of atomicity?
- “true” atomicity at the CPU level - the CPU has some special atomic instructions to perform read-modify-write (RMW) operations in order to implement mutexes and the like
- “logical” atomicity done using mutexes
- only 1 thread can use a mutex at a time
- a second thread locking the mutex will be blocked
- therefore as long as all threads lock the mutex before accessing the shared state and unlock after accessing it, that will guarantee that no thread is interrupted
- even if it gets preempted!
- Deadlocks
- are when two (or more) actors are waiting for each other forever, and will therefore never wake up
- A is sleeping waiting for B to do something, and B is sleeping waiting for A to do something
- can only happen when you have 2 or more shared resources
- can never happen with a single shared resource (e.g. 1 mutex)
- there are 4 conditions necessary for deadlocks… go look at the slides
- IPC (Inter-Process Communication)
- how processes communicate with each other 🙃
- there are a few techniques:
- Files: 2 processes open the same file, one writes, the other reads.
- simple, widely available, synchronization handled by OS
- but not very high performance
- Pipes: a kind of virtual file (character device) where one process writes and another process reads, and the kernel forwards the data from the first process to the second
- simple, widely available, but different OSes can have pretty different APIs/behaviors for them
- better performance but still requires context switches
- Shared Memory: use virtual memory system to map the same physical page into 2 or more processes’ virtual address spaces
- set up once by the OS, then no more context switches after that - whatever one process stores, the other can load
- super high performance, but the OS and the CPU hardware both have to support it, so it’s probably not available on resource-constrained systems
- Sockets: use networking stack to open “network connections” between 2 or more processes; they’re like pipes, but with an address and protocol
- can connect to other processes on the same computer, or to other computers, so can be useful for scaling systems up to e.g. clusters or cloud computing
- requires a networking stack, which small/embedded/secure OSes may not have