## Announcements

• Come up when I call the FIRST LETTER OF YOUR USERNAME
• Average Case vs. Amortized analysis
• Both are kinda sorta averaging things out but… they’re used in different circumstances
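• Amortized analysis is used when the cost of a single operation fluctuates from call to call (say, an array list add() that occasionally has to resize), so you average over a whole sequence of operations.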
• Average case analysis can be used on any algorithm, and requires that you have a probability distribution of possible inputs.
• Say, “how often each item in a list is accessed”
• OMETs are up
• Leave a review
• It’s anonymous
• Proj5 out shortly.
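To make the amortized idea from the announcements a bit more concrete, here is a minimal sketch that counts how many item copies a doubling array performs over n appends. The class name, the method name, and the starting capacity of 1 are all assumptions made for illustration; this is not code from the course.

```java
// Hypothetical helper (not from the course code): counts the element copies
// a doubling array performs while n items are appended one at a time.
class AmortizedDoubling {
    static long copiesForAppends(int n) {
        int capacity = 1;   // assumed starting capacity
        int size = 0;
        long copies = 0;
        for (int k = 0; k < n; k++) {
            if (size == capacity) {
                copies += size;   // a resize copies every existing item
                capacity *= 2;
            }
            size++;
        }
        return copies;
    }

    public static void main(String[] args) {
        for (int n : new int[]{ 10, 1000, 1000000 }) {
            System.out.println(n + " appends -> " + copiesForAppends(n) + " copies");
        }
    }
}
```

Most appends copy nothing and a few copy a lot, but the total stays under 2n, so the cost averaged over the whole sequence is constant per append. No probability distribution is involved, which is what separates this from average case analysis.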

## One last thing about Lists

• Let’s say you want to use a list where you insert/remove things from the front a lot.
• You’d think “ok, I should use a linked list instead of an array list.”
• You’d think that, wouldn’t you.
• The thing is…
• Modern computers are really really good at handling arrays.
• It has to do with how the data is “clustered” in memory.
• With an array, all the items are adjacent to each other. This is good.
• With a linked list, the items might be all over the place.
• Think of it like a library. What’s faster:
• having all the books you need on one little shelf next to each other?
• or having to run all over the library to find all the books you need?
• This is one case where big-O notation can lead you astray.
• Those multiplicative constants can sometimes matter a lot.
• A linked list never has to be resized the way an array does.
• But the time you spend hopping all over memory in a linked list can outweigh the time you spend resizing an array!
• It’s just one more fun variable to include in your decision-making process!

## Iterators

• Iteration means “going over the items of something”
• (Or sometimes just “doing something over and over”)
• How do you iterate over the items of an array?
  for(int i = 0; i < a.length; i++) {
      item = a[i];
  }

• What’s the runtime of this?
• The loop is run n times
• Accessing the array is $O(1)$
• So it’s overall $O(n)$
• OK, let’s naively try using this same approach for a linked list.
  for(int i = 0; i < list.size(); i++) {
      item = list.get(i);
  }

• The loop is still run n times
• But accessing the ith item of the list is $O(n)$
• So it’s overall $O(n^2)$!
• We’re also going to get into non-linear data structures today
• How do you iterate over all the items in a thing that doesn’t have an “order” like a list?
• Let’s say we have a linked list.
• We know that we can iterate over an array in $O(n)$ time.
• So… why don’t we turn the linked list into an array?
  Object[] a = new Object[list.size()];
  int i = 0;
  for(Node n = head; n != null; n = n.next) {
      a[i] = n.value; // store the item, not the node itself
      i++;
  }

• turning a linked list into an array is $O(n)$
• then we can iterate over the array in $O(n)$
• and $O(n) + O(n) = O(n)$, so yay!
• But…
• This also requires $O(n)$ space
• And what if the client is doing a search for something that’s at item 3 of a 1,000,000 item list?
• hm.
• A new operation: Sequential access
• A “loop over all items in a data structure” is an INCREDIBLY common operation
• Doing it as a sequence of “get item i” operations is not ideal for all data structures
• So let’s make iteration (sequential access) its own Thing
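As a rough sketch of what "sequential access" buys us on a linked list, the code below compares the get-by-index loop with simply following the links. `IntNode`, `listOf`, and the method names are hypothetical stand-ins, not the course's list class.

```java
class SequentialAccessSketch {
    // Hypothetical minimal node; the course's real list class is not shown here.
    static class IntNode {
        int value;
        IntNode next;
        IntNode(int value, IntNode next) { this.value = value; this.next = next; }
    }

    // Builds a linked list holding the given values in order.
    static IntNode listOf(int... values) {
        IntNode head = null;
        for (int i = values.length - 1; i >= 0; i--) {
            head = new IntNode(values[i], head);
        }
        return head;
    }

    // O(n) per call: walks from the head every time, like list.get(i).
    static int get(IntNode head, int index) {
        IntNode n = head;
        for (int i = 0; i < index; i++) {
            n = n.next;
        }
        return n.value;
    }

    // O(n^2) overall: n calls to an O(n) get.
    static int sumByIndex(IntNode head, int size) {
        int total = 0;
        for (int i = 0; i < size; i++) {
            total += get(head, i);
        }
        return total;
    }

    // O(n) overall and O(1) extra space: follows each link exactly once.
    static int sumByWalking(IntNode head) {
        int total = 0;
        for (IntNode n = head; n != null; n = n.next) {
            total += n.value;
        }
        return total;
    }

    public static void main(String[] args) {
        IntNode head = listOf(1, 2, 3, 4);
        System.out.println(sumByIndex(head, 4));  // 10
        System.out.println(sumByWalking(head));   // 10
    }
}
```

Walking the links needs no extra array, and a search can stop as soon as it finds what it wants. The catch is that the loop has to know about the nodes, which is the part an iterator hides from the client.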

## Java Iterators

• A Java Iterator is an object which acts as a “helper” to iterate over some data structure
• It lets us “remember” where we are in the sequence
• In Java there is an interface Iterator<T> with these methods:
• boolean hasNext() says whether or not the loop should continue
• T next() gets the next item and moves ahead by 1 item
• void remove() removes the last item that was returned by next()
• can only remove 1 item after each next() call
• not strictly necessary - removal is not always desirable or simple to implement!
• Thinking about a regular for loop…
• we remember where we are with the loop counter i
• the hasNext functionality is the loop condition: i < a.length
• the next functionality is the increment step: i++
• The generic for loop (the Java docs call it the “enhanced for” or for-each loop)
• Java has a second kind of for loop that looks like this:
• for(T item : someCollection)
• where T is whatever type the values are
• You can use these with regular arrays too

  int[] a = new int[]{ 1, 2, 3, 4, 5 };
  for(int value : a) { ... }

• But how does it woooork?
• To be able to use a generic for loop on something, it must implement Iterable<T>
• It has one (important) method:
• Iterator<T> iterator() returns a new iterator object.
• As long as it implements this interface, an object can appear on the right side of the colon in the generic for loop.
• (Arrays are a bit of a special case :P)
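Roughly speaking, a generic for loop over an Iterable behaves like the hand-written hasNext()/next() loop below. This is a sketch of the idea, not the exact code the compiler generates, and the class and method names are made up for the example.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ForEachDesugared {
    // The convenient version: the loop asks items for an iterator behind the scenes.
    static int sumWithForEach(Iterable<Integer> items) {
        int total = 0;
        for (int value : items) {
            total += value;
        }
        return total;
    }

    // Approximately the same loop written out by hand.
    static int sumWithIterator(Iterable<Integer> items) {
        int total = 0;
        Iterator<Integer> it = items.iterator(); // a fresh iterator for this loop
        while (it.hasNext()) {                   // the loop condition
            int value = it.next();               // get the item AND move ahead
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(sumWithForEach(list));   // 15
        System.out.println(sumWithIterator(list));  // 15
    }
}
```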

## Implementing an Iterator

• Let’s consider our old friend, ArrayBag
• Ex24Iterators.java contains a minimalist implementation of ArrayBag
• This time, it implements Iterable and contains a nested class, ArrayBagIterator which implements Iterator
• Notice that this class is not static!
• This means the nested class can access the outer class’s members.
• Really, it does this by having a hidden field which points to the instance of ArrayBag which created it.
• So there’s an implicit “link” between each ArrayBagIterator and an ArrayBag
• ArrayBagIterator has one field to keep track of where we are, _i.
• It’s just like a loop counter.
• hasNext() is like the loop condition.
• next() has an extra complication: we have to make sure we don’t go off the end.
• In a generic for loop, this should never happen.
• But if someone is using the iterator object directly, they could call next() after hasNext() already returns false.
• It’s an awful lot of code for something so simple.
• Welcome to the wonderful world of Java boilerplate. :P
• This language feels like filling out tax forms, sometimes.
• Let’s think about how we would implement remove()
• when can we call remove()?
• after calling next()
• …but only once
• so maybe we need a field to keep track of whether or not the user is allowed to call remove()
• how do we remove the current item?
• well we have removeEntry(i) which is perfect for this situation
• but removeEntry fills the hole by copying the last item into the slot we just emptied, and _i has already moved past that slot
• so if we call next() again, we’ll skip that item
• so maybe we need to decrement _i so that on the next next(), it’ll see that item.
• sheesh.
• all this extra complexity is often not worth it.
• then consider the case where you have two iterators iterating over the same bag at once (like in a nested loop)
• if one removes something, what should happen to the other?
• AAAAAAAAAAAAAAAAAA
• it’s totally fine if you don’t implement remove() and leave it unsupported.
• One last point
• The Iterable interface is not very flexible. It only gives you one way to iterate
• What if you have a List and want to iterate backwards?
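• Actually, it’s not hard. Just have a method Iterator<E> reverseIterator() which returns a different kind of iterator object
• But you will have to call it explicitly, and the generic for loop won’t take it: the right side of the colon must be an Iterable, not an Iterator
• So either drive the iterator by hand:
• for(Iterator<E> it = list.reverseIterator(); it.hasNext(); ) { E item = it.next(); ... }
• or have the method return an Iterable<E> (whose iterator() hands out the backwards iterator); then for(E item : list.reverseIterator()) ... works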
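Here is one way the remove() reasoning above could look in code. `SketchBag` is a hypothetical, stripped-down stand-in for ArrayBag (fixed capacity, no resizing); it is not the contents of Ex24Iterators.java, and the field names are assumptions.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical stand-in for ArrayBag; names and layout are assumptions.
class SketchBag<T> implements Iterable<T> {
    private T[] items;
    private int size = 0;

    @SuppressWarnings("unchecked")
    SketchBag(int capacity) { items = (T[]) new Object[capacity]; }

    void add(T item) { items[size++] = item; } // no resizing, for brevity

    int size() { return size; }

    // Fills the hole with the last item; a bag is free to reorder its items.
    private void removeEntry(int index) {
        items[index] = items[size - 1];
        items[size - 1] = null;
        size--;
    }

    @Override
    public Iterator<T> iterator() { return new BagIterator(); }

    // Not static, so it can read the outer bag's fields directly.
    private class BagIterator implements Iterator<T> {
        private int i = 0;                  // like a loop counter
        private boolean canRemove = false;  // true only right after next()

        @Override
        public boolean hasNext() { return i < size; }

        @Override
        public T next() {
            if (!hasNext()) throw new NoSuchElementException();
            canRemove = true;
            return items[i++];
        }

        @Override
        public void remove() {
            if (!canRemove) throw new IllegalStateException();
            canRemove = false;
            i--;             // step back to the slot of the item just returned
            removeEntry(i);  // the last item lands in slot i, so next() will see it
        }
    }

    public static void main(String[] args) {
        SketchBag<Integer> bag = new SketchBag<>(10);
        for (int v = 1; v <= 6; v++) bag.add(v);
        Iterator<Integer> it = bag.iterator();
        while (it.hasNext()) {
            if (it.next() % 2 == 0) it.remove(); // drop the even numbers
        }
        for (int v : bag) System.out.println(v);
    }
}
```

This sketch does nothing about the two-iterators-at-once problem: a second iterator over the same bag would not notice the removal.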

## Trees

How appropriate for a snowy day approaching Christmas! 🎄🌨❄️☃

• So far we’ve only looked at data structures which are zero- or one-dimensional
• Bags and Sets are “zero-dimensional” cause they are unordered; they’re an amorphous “blob”
• Stacks, Arrays and Lists are “one-dimensional” cause, well, they’re a linear order
• I mean you can have 2D arrays, but that’s not really what I’m getting at :P
• But a common way of organizing things is hierarchically (here-ARK-ick-al-ly)
• The files on your computer are in folders, and folders can contain other folders
• Classes in Java form an inheritance hierarchy where each class has a “parent” class, except for Object which is the “root”
• Family trees are… mostly hierarchical :P
• Languages of all kinds (human, programming, mathematical) have a tree-like structure
• Consider an expression like 2 * x^2 + 9 * y ÷ 4
• If you were to write this with “full” parentheses, it would be (2 * (x^2)) + ((9 * y) ÷ 4)
• Now you can see that either side of each operator is either an operand or another operator
• (Did anyone ever diagram sentences in English class…? Is that a thing still?)
• A tree is a kind of linked data structure composed of nodes. (Sound familiar?)
• (A linked list is a tree too, just… a very skinny tree. A stick?)
• Each node can point to zero or more other nodes, giving a branching structure (like a real tree).
• Trees cannot have cycles and cannot have two or more arrows pointing to one node.
• Terminology
• Each node has a value, like in a linked list.
• Children are the nodes “below” a node (the ones it links to).
• The parent is the node “above” a node (the one that links to it).
• The root is the “top” node (the one that has no parent).
• e.g. Object is the root of the Java inheritance hierarchy.
• Internal nodes have at least one child but are not the root.
• Leaf nodes have no children.
• Siblings are nodes with the same parent.
• Descendants are all the nodes under a certain node (its children and their children, recursively).
• Ancestors are all the parent nodes from one node to the root.
• Trees have a height, which is how many “levels” there are
• i.e. how many nodes are on the path from the root to the “furthest” or “deepest” leaf (some books count the “links” on that path instead, which gives one less)
• This is an important property as many tree algorithms take time proportional to the height
• It’s possible for the height of an n-node tree to be very different
• If all the nodes are stacked up in a line, the height is n.
• This is what a linked list is!
• If all the nodes are evenly “spread out”, it can be… well…. hm.
• (It’s logarithmic.)
• A special kind of tree is a binary tree
• Each node has at most 2 children.
• It has nothing to do with binary numbers… remember binary search?
• Each “level” in the tree can hold up to a power of 2 nodes (1, 2, 4, 8, 16…)
• Remember when we talked about recursion trees?
• In merge sort, for n items, we had… how many levels?
• $\log(n)$.
• That’s the height of the recursion tree.
• And that’s why merge sort was fast - because that tree had $\log(n)$ levels.
• Full and Complete binary trees
• A full binary tree has every level filled (some books call this a “perfect” binary tree)
• So if it’s 5 levels deep, it will have 1 + 2 + 4 + 8 + 16 = 31 items
• That happens to be $2^5 - 1$
• A complete binary tree has every level filled except possibly the deepest…
• And the deepest level has nodes filled in from left to right
• Why do we care?
• Well, we can actually ditch the links and store these trees as arrays instead.
• We’ll see that next week…
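To spell out the "(It's logarithmic.)" claim for the evenly spread out case: taking a full binary tree with $h$ levels (counting levels the same way as the 5-level example above), the node count $n$ is a geometric sum.

$$
n = \sum_{k=0}^{h-1} 2^k = 2^h - 1
\quad\Longrightarrow\quad
h = \log_2(n + 1)
$$

With $h = 5$ this gives $n = 2^5 - 1 = 31$, matching the count above. Going the other way, a full tree with about a million nodes is only about 20 levels tall, while the same nodes stacked in a line would be a million levels tall.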

## Trees and Recursion (a love story)

• Let’s say I have a tree of numbers.
• I want to sum up all of these numbers.
• How would I do it in a loop?
• We’d start at the root, I guess.
• And then uh… loop over the children?
• And then???????
• This is so tricky because loops are essentially one-dimensional
• They’re a poor fit for a tree!
• Recursion really, really shines when paired with trees.
• With recursion, it’s very natural to branch into multiple dimensions.
• Essentially, we make the recursion tree match the actual tree!
• What is the sum of an “empty” tree?
• Like, one with 0 nodes. Not even a root.
• Well, let’s say it’s 0.
• If we have a non-empty tree…
• Well, the current node has a value.
• And we have to add that to… whatever its left sub-tree’s sum is.
• And whatever its right sub-tree’s sum is.
• So here’s the function:

  int treeSum(Node n) {
      if(n == null) {
          return 0;
      } else {
          return n.value + treeSum(n.left) + treeSum(n.right);
      }
  }
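To try the function out, here is a small self-contained sketch. The `Node` class below (an int value plus left and right links) and the sample tree are assumptions for illustration; the notes don't show the course's actual node class.

```java
class TreeSumDemo {
    // Hypothetical binary-tree node; field names follow the function above.
    static class Node {
        int value;
        Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static int treeSum(Node n) {
        if (n == null) {
            return 0; // the empty tree sums to 0
        }
        return n.value + treeSum(n.left) + treeSum(n.right);
    }

    // Builds this tree:
    //       5
    //      / \
    //     3   8
    //    /   / \
    //   1   6   9
    static Node sampleTree() {
        return new Node(5,
            new Node(3, new Node(1, null, null), null),
            new Node(8, new Node(6, null, null), new Node(9, null, null)));
    }

    public static void main(String[] args) {
        System.out.println(treeSum(sampleTree())); // 5 + 3 + 8 + 1 + 6 + 9 = 32
    }
}
```

Each call handles one node and hands its two sub-trees to two more calls, so the shape of the calls mirrors the shape of the tree.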

• What if we wanted to see if a tree contains a value?
• An empty tree does not contain anything, so return false.
• Otherwise, either the current node has the value…
• or its left sub-tree does…
• or its right sub-tree does.
• So:
  boolean treeContains(Node n, E value) {
      if(n == null) {
          return false;
      } else {
          // || does "short-circuit" evaluation, so
          // this will only go as far as it needs to.
          return n.value.compareTo(value) == 0
              || treeContains(n.left, value)
              || treeContains(n.right, value);
      }
  }
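For a version that compiles on its own, the sketch below makes the generic type explicit. The `Node<E>` class and the `E extends Comparable<E>` bound are assumptions: the notes only show that values are compared with compareTo.

```java
class TreeContainsDemo {
    // Hypothetical generic binary-tree node.
    static class Node<E> {
        E value;
        Node<E> left, right;
        Node(E value, Node<E> left, Node<E> right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static <E extends Comparable<E>> boolean treeContains(Node<E> n, E value) {
        if (n == null) {
            return false; // an empty tree contains nothing
        }
        // Short-circuit ||: stops at the first sub-expression that is true.
        return n.value.compareTo(value) == 0
            || treeContains(n.left, value)
            || treeContains(n.right, value);
    }

    // An unordered tree of strings:
    //       oak
    //      /   \
    //    yew   elm
    //    /
    //  fir
    static Node<String> sampleTree() {
        return new Node<>("oak",
            new Node<>("yew", new Node<>("fir", null, null), null),
            new Node<>("elm", null, null));
    }

    public static void main(String[] args) {
        System.out.println(treeContains(sampleTree(), "fir"));  // true
        System.out.println(treeContains(sampleTree(), "palm")); // false
    }
}
```

Because this tree has no ordering, a failed search has to visit every node, so the worst case is $O(n)$.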

• Does this algorithm seem familiar…?
• What if the values were ordered?
• Feels awfully binary-search-y…
• YUP
• NEXT TIME