Announcements

• Last lectuuuuuuuuuuure yaaaaaaaaay
• Final will be normal day/time/room next week (Tuesday at 6:00, here)
• Material only since first exam PLUS BACKTRACKING
• Since that was like “the last topic” and you did that project after exam 1
• We’ll have a little review beforehand as before
• Similar size/length/questions to first exam

Tree traversal

• Let’s say we want to print out the contents of a tree
• Say, a tree that represents an expression ;)
• What order do we print the nodes out?
• Is there a “right” order?
• With a list of items, it really only makes sense to go from one end to the other, but…
• With an expression tree, well.
• We could print out the operator before the children; between them; or after them.
• These are called preorder, inorder, and postorder, respectively.
• What is the runtime of a tree traversal?
• How many nodes are we looking at?
• All of them!
• So it’s $O(n)$.
• We could use this to turn the tree into a list, too.
• We could append the value of each node to the list in whatever order (pre-, in-, post-) we want, to get different list representations
• You pass the list as an argument to the recursive visit functions.
• Pre-/in-/post-order traversals are what we call depth-first traversals
• At each node, we explore the subtree of each child fully.
• That is, when we do the recursive call for the left child, all the descendants of the left child are visited before that recursive call returns.
• But there’s another way: breadth-first
• In this traversal strategy, we visit all the direct children of a node before looking at any of their children.
• Think of it like exploring a cave or a maze.
• You could go down the deepest tunnels... and then when you get to the end, back up a bit and try a different path.
• That’s depth-first.
• Or, you could go a little bit down all the paths, and make notes of where each one splits off, before going a little further down each one.
• You might also call breadth-first traversal “level-order traversal”
• Because the levels are visited in order.
• First the root; then all the children of the root; then all the grandchildren of the root…
• But how would we write breadth-first traversal recursively?
• ???
• Like, you’re on the left side, and then you want to jump to the right side…
• how????
• Well, you can’t, really.
• We need something else. Something new.
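Before moving on: the three depth-first orders described in this section can be sketched roughly as below. This is only a minimal illustration, not course code. The `Node` and `Traversals` classes are names made up for this example, and each visit function takes the output list as an argument, as suggested above.

```java
import java.util.List;

// Hypothetical binary tree node, for illustration only.
class Node<E> {
    E value;
    Node<E> left, right;

    Node(E value, Node<E> left, Node<E> right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class Traversals {
    // Preorder: the node itself, then its left subtree, then its right subtree.
    static <E> void preorder(Node<E> n, List<E> out) {
        if (n == null) return;
        out.add(n.value);
        preorder(n.left, out);
        preorder(n.right, out);
    }

    // Inorder: left subtree, then the node, then right subtree.
    static <E> void inorder(Node<E> n, List<E> out) {
        if (n == null) return;
        inorder(n.left, out);
        out.add(n.value);
        inorder(n.right, out);
    }

    // Postorder: left subtree, then right subtree, then the node.
    static <E> void postorder(Node<E> n, List<E> out) {
        if (n == null) return;
        postorder(n.left, out);
        postorder(n.right, out);
        out.add(n.value);
    }
}
```

For the expression tree of (1 + 2) * 3, the resulting lists would be `* + 1 2 3` (preorder), `1 + 2 * 3` (inorder), and `1 2 + 3 *` (postorder). Each node is added exactly once, which is where the $O(n)$ comes from.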

Queues

• A queue (pronounced like the letter Q) is sort of the companion of stacks.
• Stacks are LIFO - Last In, First Out.
• Queues are FIFO - First In, First Out.
• Think of the line at the checkout at a store.
• That’s a queue.
• Three main operations:
• enqueue(E) - put an item at the back of the queue
• E dequeue() - get the item from the front of the queue
• E peek() - peek at the item at the front of the queue
• Overall very similar to a stack!
• We can emulate a stack with a list… how about a queue?
• Add items to one end (end?)
• Remove from the other (beginning?)
• But we’d like these operations to be fast…
• Is adding/removing items to beginning/end of list guaranteed to be fast?
• Not for an array implementation, no.
• Adding/removing at the beginning of an array list is $O(n)$.
• Implementing a queue…
• Simplest way would be a linked list
• Think about how to implement enqueue()
• Where do we put it?
• What do we have to keep track of?
• Now dequeue()
• Where do we take the item?
• What else do we have to keep track of?
• But there’s a problem with this.
• Every time we enqueue something, we allocate a new node.
• Every time we dequeue something, we let that old node be garbage collected.
• It might not seem like it, but this is a huge waste of time
• new is not free!
• Its runtime complexity is not really visible to you, but it does take time. Sometimes a lot of time.
• And everything you new eventually has to be swept up by the garbage collector…
• And THAT takes even more time.
• It seems like a waste. Why not reuse those nodes?
• Keeping nodes around for a rainy day
• If we treat the nodes as a sort of valuable resource, we can keep a second list of ones to be reused
• When we dequeue, instead of just letting that node float off…
• We put it on the free list.
• Then, when we enqueue…
• First, we check if there’s a node on the free list.
• If so, remove it and tack it onto the end of the queue.
• If not, fall back to new.
• This is a powerful technique when you have a collection of small objects that need to be constantly created and discarded
• If your data structure only needs to hold at most n items at any given time, then after a little “warming up,” you will not need to allocate memory anymore!
• You could implement a linked stack or list or bag or set or whatever the same way.
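Here is one way the linked queue with a free list might look. Treat it as a sketch under assumptions: the class name `RecyclingQueue`, the field names, and the `allocations()` counter (there only to show how often `new` actually runs) are all invented for this example.

```java
import java.util.NoSuchElementException;

// Sketch of a linked queue that recycles its nodes through a free list.
class RecyclingQueue<E> {
    private static class Node<E> {
        E value;
        Node<E> next;
    }

    private Node<E> head;     // front of the queue: dequeue from here
    private Node<E> tail;     // back of the queue: enqueue here
    private Node<E> free;     // spare nodes, kept as a singly linked stack
    private int allocations;  // how many times we fell back to new (demo only)

    public void enqueue(E item) {
        Node<E> n;
        if (free != null) {   // reuse a spare node if there is one
            n = free;
            free = free.next;
        } else {              // otherwise fall back to new
            n = new Node<>();
            allocations++;
        }
        n.value = item;
        n.next = null;
        if (tail == null) head = n; else tail.next = n;
        tail = n;
    }

    public E dequeue() {
        if (head == null) throw new NoSuchElementException("queue is empty");
        Node<E> n = head;
        head = head.next;
        if (head == null) tail = null;
        E item = n.value;
        n.value = null;       // don't keep the item alive through the free list
        n.next = free;        // put the node on the free list instead of dropping it
        free = n;
        return item;
    }

    public E peek() {
        if (head == null) throw new NoSuchElementException("queue is empty");
        return head.value;
    }

    public boolean isEmpty() { return head == null; }

    public int allocations() { return allocations; }
}
```

If the queue never holds more than 3 items at once, `allocations()` stops at 3 no matter how many items pass through. That is the "warming up" effect.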
• Array Queues
• With an array bag or stack or whatever, how did we represent the items “in” the collection?
• We kept a size
• And all the array indices < size were “in” the collection
• Let’s do that.
• When we enqueue, it’s the same as before:
• Check the capacity
• Put the item at index size
• Increment size
• But when we dequeue…
• We remove from the front
• So now we have this gap. How’d we handle that with lists?
• We moved all the things after it one slot to the left
• But this is like, the perpetual worst case: removing the first item
• So this is a non-starter.
• But hey. size is essentially tracking the last item in the queue.
• What if we also tracked the first item in the queue?
• In an array queue, we keep track of where the queue starts and ends within the array.
• Enqueue puts the thing at the end and moves the end up by 1.
• Dequeue removes the thing at the start and moves the start up by 1.
• But… what happens when you go off the end of the array?
• Well, if all the array slots are full, you increase the capacity.
• But what about if your array’s capacity is 100, and you only have 12 things in the queue?
• The next free slot is…
• At the beginning!
• What we do is wrap around
• We can use the modulo operator to do this “wrapping around” behavior
• end = (end + 1) % _contents.length;
• Modulo gives us the remainder of the division.
• This ensures that when we go off the end, it wraps back around to 0.
• Believe it or not, that’s about the only change we have to make.
• As long as we keep track of the size as well, this Just Works.
• We’ll also have to handle resizing slightly differently.
• You might also see this referred to as a ring buffer - cause it’s ring-shaped.
• So which is better?
• Array queues can have smaller space and time overhead.
• But asymptotically, they’re about the same.
• So whatever floats your boat.
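A possible array queue (ring buffer) along the lines described above is sketched below. The class name `ArrayQueue` and the `grow()` helper are made up for this example; `_contents` and `end` follow the snippet in the notes. Doubling the capacity is an assumption, and any growth policy would do.

```java
import java.util.NoSuchElementException;

// Sketch of an array-backed queue that wraps around the end of the array.
class ArrayQueue<E> {
    private Object[] _contents;
    private int start;  // index of the front item
    private int end;    // index of the next free slot at the back
    private int size;   // number of items currently in the queue

    public ArrayQueue(int capacity) {
        _contents = new Object[Math.max(1, capacity)];
    }

    public void enqueue(E item) {
        if (size == _contents.length) grow();   // only grow when every slot is full
        _contents[end] = item;
        end = (end + 1) % _contents.length;     // wrap back to 0 past the last index
        size++;
    }

    @SuppressWarnings("unchecked")
    public E dequeue() {
        if (size == 0) throw new NoSuchElementException("queue is empty");
        E item = (E) _contents[start];
        _contents[start] = null;                // let the item be collected
        start = (start + 1) % _contents.length;
        size--;
        return item;
    }

    public boolean isEmpty() { return size == 0; }

    // Resizing differs from an array list: the items may be wrapped around,
    // so copy them out in queue order and restart at index 0.
    private void grow() {
        Object[] bigger = new Object[_contents.length * 2];
        for (int i = 0; i < size; i++) {
            bigger[i] = _contents[(start + i) % _contents.length];
        }
        _contents = bigger;
        start = 0;
        end = size;
    }
}
```

Keeping `size` separately is what lets us tell a full queue from an empty one, since `start == end` in both situations.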

Back to level-order traversal…

• With a queue, level-order traversal becomes super easy.
1. Put the root node in the queue.
2. While the queue is not empty:
   1. Dequeue a node.
   2. Visit it.
   3. Enqueue any of its children.
• It might not seem like this will work, but it totally does.
• We’ll dequeue the root, visit it, and add its children (the second level).
• The next step, we dequeue a second level node, and enqueue its children (third level).
• We continue dequeuing second level nodes and enqueuing third level nodes.
• Then we’ll get to the third level nodes and start enqueuing fourth level ones… etc.
• Interestingly, if you replace the queue with a stack:
• You get a preorder traversal!
• (well, assuming you push the children in right-to-left order.)
• Remember that a recursive algorithm has an implicit stack: the call stack used to remember which functions are “in progress.”
• (it’s possible to make iterative inorder and postorder traversals too, just not quite as straightforward.)
• If, instead, we used a List…
• And instead of removing to dequeue, we just kept an index into the list as the “next node to visit…”
• We would now have an algorithm to turn a tree into a list/array in level order.
• That seems like “okay, so what?” but…
• Remember those full and complete binary trees?
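The queue-based algorithm above, and its stack-based cousin, could be written something like this. It is a sketch only: `TreeNode` and `LevelOrder` are hypothetical names, and `java.util.ArrayDeque` stands in for both the queue and the stack rather than a hand-written one.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

// Hypothetical binary tree node, for illustration only.
class TreeNode<E> {
    E value;
    TreeNode<E> left, right;

    TreeNode(E value, TreeNode<E> left, TreeNode<E> right) {
        this.value = value;
        this.left = left;
        this.right = right;
    }
}

class LevelOrder {
    // Breadth-first: a FIFO queue holds the nodes waiting to be visited.
    static <E> List<E> levelOrder(TreeNode<E> root) {
        List<E> out = new ArrayList<>();
        if (root == null) return out;
        ArrayDeque<TreeNode<E>> queue = new ArrayDeque<>();
        queue.addLast(root);                          // enqueue
        while (!queue.isEmpty()) {
            TreeNode<E> n = queue.removeFirst();      // dequeue
            out.add(n.value);                         // "visit"
            if (n.left != null) queue.addLast(n.left);
            if (n.right != null) queue.addLast(n.right);
        }
        return out;
    }

    // Same loop with a LIFO stack: this produces a preorder traversal,
    // as long as the children are pushed right-to-left.
    static <E> List<E> preorderWithStack(TreeNode<E> root) {
        List<E> out = new ArrayList<>();
        if (root == null) return out;
        ArrayDeque<TreeNode<E>> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            TreeNode<E> n = stack.pop();
            out.add(n.value);
            if (n.right != null) stack.push(n.right); // right first...
            if (n.left != null) stack.push(n.left);   // ...so left is popped first
        }
        return out;
    }
}
```

The only difference between the two methods is which end of the collection the next node comes from.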

Representing trees as arrays

• Array queues give us some performance and space benefits
• Array trees can too!
• Here’s the idea:
• We store all the nodes of the tree in level order in an array
• But the tree must be complete (and “full” is a subset of complete)
• By doing this, we get this really neat/weird relationship between the array indices and a node’s children and parent.
• Let’s say you have an array like this.
• [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
• Let’s say that this is the level-order array representation of a binary tree rooted at 0.
• 1 and 2 are its children;
• 3 and 4 are 1’s children; 5 and 6 are 2’s children…
• and so on.
• The indices of these slots also go 0, 1, 2, 3, 4…
• Let’s pick a node, say 6. It appears at index 6.
• Its children are 13 and 14.
• Its parent is 2.
• Now take 5, at index 5.
• Its children are 11 and 12.
• Its parent is 2.
• And 2, at index 2.
• Its children are 5 and 6.
• Its parent is 0.
• Do you notice any patterns here?
• In a level-order array representation of a binary tree:
• The children of a node at index $i$ are at $2i + 1$ and $2i + 2$
• The parent of a node at index $i$ is at $\lfloor (i - 1) / 2 \rfloor$ (for $i > 0$; the root at index 0 has no parent)
• $\lfloor\text{this symbol}\rfloor$ is “floor” and it means “round down to the nearest integer.”
• This is super convenient!
• No more pointers to nodes!
• Level order traversal is just iterating from index 0 to n - 1!
• Arrays are fast!
• Woo!
• But…
• Does this work for incomplete trees?
• It can, but it will mean having “holes” in your array.
• You might have null in array slots where there is no node.
• These holes might waste a lot of space in a tree that’s far from complete.
• These holes will make level-order traversal slightly trickier.
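The index arithmetic for array trees fits in a few lines. The sketch below assumes a complete tree stored in an `int[]` with no holes; `ArrayTree` and its method names are invented for this example. The `inorder` method is there to show that recursive traversals still work without any node objects.

```java
import java.util.List;

// Sketch of the parent/child index math for a level-order array tree.
class ArrayTree {
    static int leftChild(int i)  { return 2 * i + 1; }
    static int rightChild(int i) { return 2 * i + 2; }

    static int parent(int i) {
        if (i <= 0) throw new IllegalArgumentException("the root has no parent");
        return (i - 1) / 2;   // integer division rounds down for non-negative values
    }

    // Inorder traversal using only indices. An index past the end of the
    // array plays the role that a null child link plays in a linked tree.
    static void inorder(int[] tree, int i, List<Integer> out) {
        if (i >= tree.length) return;
        inorder(tree, leftChild(i), out);
        out.add(tree[i]);
        inorder(tree, rightChild(i), out);
    }
}
```

With the 15-element array from above, `leftChild(6)` and `rightChild(6)` are 13 and 14, and `parent(6)` and `parent(5)` are both 2, matching the pattern worked out by hand.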

Binary Search Trees

• A binary search tree is a special case of a binary tree
• It’s like how an ordered array is a special case of an array
• In a BST, the values have an order, and for every node in the tree:
• All its left descendants have values less than it.
• All its right descendants have values greater than it.
• (If you want to allow duplicates, you might say “greater than or equal” but… duplicates often aren’t useful in BSTs.)
• What methods might we have?
• boolean add(E) - true if addition succeeded
• boolean remove(E) - true if removal succeeded
• boolean contains(E) - true if value exists in BST
• boolean isEmpty(), int size() - we know these.
• Let’s think about the contains() method.
• Remember binary search?
• This is basically that.
• If the searched value == the node’s value, return true.
• Else if the searched value < the node’s value, search left.
• Else, search right.
• In fact, we can write this recursively or iteratively (since there’s only a single recursion).
• Now, add().
• Essentially, we need to find the place where the new value belongs.
• So we do a search for the value.
• If we find it already in the BST, return false.
• Otherwise, we will hit a null dead end; make a new node and put the value there, returning true.
• Again, this can be done recursively or iteratively.
• Finally, remove().
• Oof...
• So there are actually 3 cases: 2 easy ones, and 1 hard one which we will transform into an easier one.
• Case 1: you want to remove a node n with no children (a leaf node).
• In that case, the parent’s child link to that node is set to null. done.
• Case 2: you want to remove a node n with one child.
• We want to maintain the BST property (that each node’s children are in order).
• n’s child will either be less or greater than n
• But n and all its descendants will be on “one side” of n’s parent.
• So, we do like we’re removing a link in a linked list: we replace the parent’s link to n with a link to its one child.
• Case 3: you want to remove a node n with two children.
• This is tricky. We need to maintain the BST property… but we’ve got these 2 loose nodes and what gets attached to what??
• Well, we can transform this case into a simpler case.
• Leaf nodes and nodes with 1 child are easier to remove.
• So why not find one of those that can replace n?
• Then we can remove that easier-to-remove one.
• The issue is: WHICH node should take n’s place?
• We have to maintain the property that everything to the left is less, and everything to the right is greater…
• So we need a value bigger than all the other things to the left (or smaller than all the other things to the right)
• We call these the inorder predecessor/successor.
• That is, if we were to perform an inorder traversal, these would be the values on either side of n.
• Finding these values is actually really easy:
• The inorder predecessor is the rightmost node in the left subtree.
• That is, go to n.left and then follow .right links until you get to the bottom.
• The inorder successor is the leftmost node in the right subtree.
• Now we can remove that other node and replace n’s value with the removed node’s value.
• HOO.
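Putting the three operations together, a BST might look roughly like the sketch below. The class name `BST` and its internals are assumptions made for this example. `contains` and `add` are written iteratively and `remove` recursively, just to show both styles, and case 3 uses the inorder predecessor (the successor would work equally well).

```java
// Sketch of a binary search tree without duplicates.
class BST<E extends Comparable<E>> {
    private static class Node<E> {
        E value;
        Node<E> left, right;
        Node(E value) { this.value = value; }
    }

    private Node<E> root;
    private int size;

    public boolean contains(E target) {
        Node<E> n = root;
        while (n != null) {
            int c = target.compareTo(n.value);
            if (c == 0) return true;
            n = (c < 0) ? n.left : n.right;   // go left if smaller, right if larger
        }
        return false;                          // hit a null dead end
    }

    public boolean add(E value) {
        if (root == null) { root = new Node<>(value); size++; return true; }
        Node<E> n = root;
        while (true) {
            int c = value.compareTo(n.value);
            if (c == 0) return false;          // already in the tree
            if (c < 0) {
                if (n.left == null) { n.left = new Node<>(value); size++; return true; }
                n = n.left;
            } else {
                if (n.right == null) { n.right = new Node<>(value); size++; return true; }
                n = n.right;
            }
        }
    }

    public boolean remove(E value) {
        int before = size;
        root = remove(root, value);
        return size < before;
    }

    // Returns the (possibly new) root of the subtree that was rooted at n.
    private Node<E> remove(Node<E> n, E value) {
        if (n == null) return null;            // value not found
        int c = value.compareTo(n.value);
        if (c < 0) {
            n.left = remove(n.left, value);
        } else if (c > 0) {
            n.right = remove(n.right, value);
        } else if (n.left == null || n.right == null) {
            // Cases 1 and 2: splice n out, linking its parent to its only child (or null).
            size--;
            return (n.left != null) ? n.left : n.right;
        } else {
            // Case 3: copy the inorder predecessor's value into n,
            // then remove the predecessor, which has no right child.
            Node<E> pred = n.left;
            while (pred.right != null) pred = pred.right;
            n.value = pred.value;
            n.left = remove(n.left, pred.value);
        }
        return n;
    }

    public int size() { return size; }
    public boolean isEmpty() { return size == 0; }
}
```

Notice that case 3 never unlinks `n` itself. It only overwrites its value and then falls into case 1 or 2 further down the tree.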

BST Analysis

• add, remove, and contains all start with a similar algorithm: find a value.
• how many steps does that take?
• well, let’s think about how long it’d take in the worst case.
• what would be a really crappy BST?
• a BSS: a binary search stick.
• in other words: a “tree” where you only have left nodes or only have right nodes.
• it becomes a linked list.
• so in the worst case, the value you want is at the end of the stick.
• so, $O(n)$.
• what about in the best case?
• it’s the root, so $O(1)$.
• well, that’s not too interesting.
• what about in the uh… worst case but in the best kind of tree?
• what’s the best, most compact kind of tree?
• a complete tree. (or full if you’re lucky)
• in that case, what’s the longest path from the root to any value?
• uh…
• well how many levels are there?
• $\log(n)$.
• there we go.
• All these methods are $O(h)$, where $h$ is the height of the tree…
• …but that height can range from $\log(n)$ (good!) to $n$ (bad.)
• this brings up an important property of BSTs: how balanced they are.
• a balanced BST is as short as it possibly can be.
• you might also say that each node is the median of it and its descendants.
• when adding/removing arbitrary values to a BST…
• you aren’t guaranteed to have a balanced tree, but it often kinda works out that way.
• there are, however, more advanced kinds of search trees that maintain balance (AVL, red-black, B+). these are like, 1501 or 1510 ones ;)
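To see the stick-versus-balanced difference concretely, here is a small experiment. It is purely illustrative: `HeightDemo` and its methods are invented names, and "height" here counts levels (nodes on the longest root-to-leaf path). It builds one BST by adding values in sorted order and another by always adding the median first.

```java
// Sketch comparing the height of a "binary search stick" with a balanced BST.
class HeightDemo {
    private static class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Plain recursive BST insertion (no duplicates expected).
    private static Node insert(Node n, int v) {
        if (n == null) return new Node(v);
        if (v < n.value) n.left = insert(n.left, v);
        else n.right = insert(n.right, v);
        return n;
    }

    private static int height(Node n) {
        return (n == null) ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    // Adds lo..hi so that each subtree's median goes in before the rest.
    private static Node insertMediansFirst(Node root, int lo, int hi) {
        if (lo > hi) return root;
        int mid = lo + (hi - lo) / 2;
        root = insert(root, mid);
        root = insertMediansFirst(root, lo, mid - 1);
        return insertMediansFirst(root, mid + 1, hi);
    }

    // Height after adding 0..n-1 in sorted order: every node hangs off the right.
    static int stickHeight(int n) {
        Node root = null;
        for (int i = 0; i < n; i++) root = insert(root, i);
        return height(root);
    }

    // Height after adding the same values, medians first.
    static int balancedHeight(int n) {
        return height(insertMediansFirst(null, 0, n - 1));
    }
}
```

With 15 values, `stickHeight(15)` is 15 while `balancedHeight(15)` is 4. Same values, same operations, very different $h$.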

Speaking of which

THAT’S THE END OF THIS CLASS WOO