• recall:
• 1 function = 1 CFG
• 1 CFG = 1 or more BBs in a graph
• local analysis rewrites the IR instructions within a BB

## Flow Analysis

• in this code, can we ever run the else?
``````  fn uhoh() {
if true {
println_s("then!");
} else {
println_s("else!");
}
}
``````
• not necessarily… constant conditions do come up, e.g.
``````  const FEATURE_ENABLED = false;
// ...
if FEATURE_ENABLED { ... } else { ... }
``````
• can we detect this in the AST?
• maybe, but more complex conditions would require implementing constant folding on the AST, which seems like a duplication of effort (since we talked about doing it to the IR last time)
• this is a problem best solved with flow analysis
• statically analyzing what parts of the CFG are reachable (another reachability oh boy!)
1. mark all BBs “unvisited.”
2. do a depth-first traversal of the CFG, starting at the entry bb (bb0):
1. mark it as visited.
2. for each successor, if it hasn’t been visited, recursively traverse it.
3. at the end, any unvisited BBs left are unreachable.
• if this feels like reachability from GC… good, cause it basically is.
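• the traversal above might look like this in Python (the dict-of-successor-lists encoding of the CFG is an assumption for this sketch, not our actual IR):

```python
# Reachability over a CFG, where the CFG is encoded as a dict mapping
# each basic-block id to a list of successor ids.
def reachable_blocks(cfg, entry="bb0"):
    visited = set()

    def dfs(bb):
        visited.add(bb)                  # step 2.1: mark it visited
        for succ in cfg.get(bb, []):     # step 2.2: visit each successor
            if succ not in visited:
                dfs(succ)

    dfs(entry)
    return visited                       # anything not in here is unreachable

cfg = {"bb0": ["bb1", "bb2"], "bb1": ["bb3"],
       "bb2": ["bb3"], "bb3": [], "bb4": ["bb3"]}
# bb4 has no path from bb0, so it never gets visited
print(sorted(reachable_blocks(cfg)))  # -> ['bb0', 'bb1', 'bb2', 'bb3']
```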
• this algorithm has an important property: it terminates
• that is, it can never get stuck in an infinite loop, no matter how complex the CFG is
• how do we know this? the intuitive proof is something like this:
• every node can only be in one of two sets (visited and unvisited).
• we only look at unvisited nodes.

• a node can only move from unvisited to visited.
• therefore, the set of nodes we look at monotonically shrinks.
• so, the maximum number of steps (node-visits) in this algorithm is bounded above by the number of nodes in the CFG.
• okay. so does this solve the original problem?
• no, but it wouldn’t be hard to modify this algorithm so it does.
• on the step that says “for each successor,” we change that to “for each successor that we can prove will be run”
• for `if true {} else {}`, we know the else will never be run, so we only visit the “then” side.
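• here's one way to sketch that modified traversal in Python (the block encoding, a terminator plus successor list, is made up for the sketch):

```python
# Reachability that only follows successors we can prove will run.
# Each block is (terminator, successors); a terminator is (kind, cond).
def live_successors(term, succs):
    kind, cond = term
    if kind == "branch" and isinstance(cond, bool):
        # literal-constant condition: only one side can ever run
        return [succs[0]] if cond else [succs[1]]
    return succs                         # otherwise, assume both sides

def reachable(cfg, entry="bb0", seen=None):
    seen = set() if seen is None else seen
    seen.add(entry)
    term, succs = cfg[entry]
    for s in live_successors(term, succs):
        if s not in seen:
            reachable(cfg, s, seen)
    return seen

# uhoh(): bb0 branches on the literal true; bb1 = then, bb2 = else
cfg = {"bb0": (("branch", True), ["bb1", "bb2"]),
       "bb1": (("jump", None), ["bb3"]),
       "bb2": (("jump", None), ["bb3"]),
       "bb3": (("ret", None), [])}
print(sorted(reachable(cfg)))  # -> ['bb0', 'bb1', 'bb3']  (the else, bb2, is dead)
```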
• boom! solved. right?
• `if 2 + 2 == 4 {} else {}`
• this is logically the same, but the IR looks something like:
``````  $t0 = 2 + 2
if $t0 == 4 then goto ... else goto ...
``````
• we can do constant folding to get `$t0 = 4`
• and then copy propagation to get `if 4 == 4`
• and that’s pretty obviously true.
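• a tiny sketch of that fold-then-propagate step (the tuple-based IR here is invented for the example):

```python
# One pass of constant folding plus copy propagation over a toy IR.
# Instructions are (op, dest, a, b); constants are plain ints.
def fold_and_propagate(instrs):
    consts = {}                      # temp name -> known constant value
    out = []
    for op, dest, a, b in instrs:
        a = consts.get(a, a)         # copy-propagate known constants in
        b = consts.get(b, b)
        if op == "add" and isinstance(a, int) and isinstance(b, int):
            op, a, b = "const", a + b, None   # fold 2 + 2 into 4
        if op == "const":
            consts[dest] = a
        out.append((op, dest, a, b))
    return out

ir = [("add", "$t0", 2, 2),          # $t0 = 2 + 2
      ("eq",  "$t1", "$t0", 4)]      # $t1 = $t0 == 4
print(fold_and_propagate(ir))
# -> [('const', '$t0', 4, None), ('eq', '$t1', 4, 4)]
```

• the `eq` on two constants could be folded too by the same trick, which is what makes `if 4 == 4` "obviously true" to the optimizer.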

## From local to global

• global optimization goes beyond single BBs and works on the CFG as a whole
• the local optimizations we talked about last time can assist global optimization
• what’s more, those local optimizations can be extended across BB boundaries
``````  fn foo1(a: bool): int {
let ret = 0;
if a { println_s("whee"); }
return ret;
}
``````
• looking at the CFG for this function, the `ret = 0` and `return ret` occur in different BBs
• but intuitively, `ret` is acting as a constant here
• we should be able to copy-propagate to get `return 0`
• and dead-code-eliminate to delete the `ret = 0` instruction
• can we always do this? well:
``````  fn foo2(a: bool): int {
let ret = 0;
if a { ret = 10; }
return ret;
}
``````
• uh oh. one path has `ret == 0`, and another has `ret == 10`.
• we probably don’t know what value `a` will be until runtime (`foo(user_input())`)
• so if there are multiple paths from definition to use, we can’t copy-propagate.
• …or can we??
``````  fn foo3(a: bool): int {
let ret = 0;
if a { ret = 10; } else { ret = 10; }
return ret;
}
``````
• multiple paths, but each path assigns it the same value.
• aaaAAA
• when you start to get a bunch of special cases like this, it’s a good idea to step back and think about the algorithm a different way.

## Trying to formalize global constant copy propagation

• a use of a variable means we get its value
• these all count as uses of `x`: `y = x`, `y = x + 5`, `if x < 10...`, `f(x)`
• global constant copy propagation applies when this condition holds:
• on every path to a use of `x`, the last assignment to `x` is `x = C`, for some constant value `C`.
• this condition captures the idea of all the examples above:
• in `foo1`, there is only one assignment to `ret`, so the condition is true.
• in `foo2`, there are two assignments to `ret`, but the constant is different in each, so the condition is false.
• in `foo3`, again there are two assignments, but the constant is the same in both, so the condition is true.
• okay, cool, but how do we implement this?
• let’s look at it at a micro level first
• let’s say we don’t know what value `x` could be. then we see `x = 50`.
• well, now we know that `x` is 50. duh.
• okay. the next instruction says `x = x * y`.
• oh, shoot. what’s `y`? dunno.
• so we’re back to not knowing what `x` is.
• remember: programs are proofs.
• and just like proofs, each step can prove or disprove some proposition.
• so we’ll just do it one step at a time.
• remember predecessors and successors?

## Actually formalizing global constant copy propagation

• we’ll define these rules on the level of instructions within a block…
• but you can generalize it to a BB by repeatedly applying the rules to its instructions in order.
• each instruction can change our knowledge of what `x` contains.
• `x = C` for some constant `C` means that next instruction, `x` is a constant.
• doesn’t matter what was in `x` before, it’s definitely a constant now.
• `x = y` or `x = y op z` means that next instruction, we don’t know what’s in `x`.
• doesn’t matter if it was a constant before, now we can’t tell what it is.
• any instruction that doesn’t modify `x` has no effect on that knowledge.
• so for `y = 10`, if `x` was `C` before, it’ll still be `C` after.
``````  // x = ANY (that is, it could be any value)
x = 10
// x = C(10)
x = 11
// x = C(11)
y = 20
// x = C(11)
x = f()
// x = ANY
``````
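• those instruction-level rules can be sketched as a little transfer function (the `ANY` sentinel and the `(dest, rhs)` instruction encoding are assumptions for the sketch):

```python
# Transfer function for our knowledge of one variable x.
# Knowledge is either ANY or ("C", value); instructions are (dest, rhs).
ANY = "ANY"

def transfer(knowledge, instr):
    """Return what we know about x after executing instr."""
    dest, rhs = instr
    if dest != "x":
        return knowledge            # doesn't modify x: knowledge unchanged
    if isinstance(rhs, int):
        return ("C", rhs)           # x = C: x is definitely that constant
    return ANY                      # x = y, x = y op z, x = f(): unknown

k = ANY
for instr in [("x", 10), ("x", 11), ("y", 20), ("x", "f()")]:
    k = transfer(k, instr)
    print(instr, "->", k)
# follows the trace above: C(10), C(11), C(11), then ANY
```

• applying it to a BB's instructions in order (as in the loop above) is exactly the "generalize to a BB" step.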
• that’s fine for straight-line code within a block. what about when you get edges involved?
• to determine the knowledge of `x` for the first instruction of a block…
• you have to look at that block’s predecessors.
• consider this code:
``````  if cond { x = 10; } else { x = 20; }
return x;
``````
• we have a diamond-shaped CFG with two edges pointing at the `return x;` block.
• on one edge, we know that `x = C(10)`.
• on the other, we know that `x = C(20)`.
• so, before the `return x;`, we (sadly) have to say `x = ANY` once again.
• however if both the “then” and “else” sides did `x = 10`
• then we’d have `x = C(10)` coming from both paths.
• in that case, we would be able to say `x = C(10)` before the `return`!
• so these intuitions lead us to these rules:
1. if any predecessor says `x = ANY`, then we have to assume `x = ANY`.
2. if all predecessors say `x = C(..)`
• if all of them say it’s the SAME constant, then we can assume it’s that constant.
• otherwise, we have to assume `x = ANY`.
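• those two block-level rules amount to a little merge function over the predecessor edges (same assumed encoding: an `ANY` sentinel, constants as `("C", value)`):

```python
# Merge the knowledge of x arriving from each predecessor edge.
ANY = "ANY"

def merge(pred_facts):
    if any(f == ANY for f in pred_facts):
        return ANY                   # rule 1: any ANY forces ANY
    if len(set(pred_facts)) == 1:
        return pred_facts[0]         # rule 2: all the SAME constant
    return ANY                       # different constants: give up

print(merge([("C", 10), ("C", 20)]))  # -> ANY  (the diamond example)
print(merge([("C", 10), ("C", 10)]))  # -> ('C', 10)
```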
• now we have everything we need to implement a naive algorithm:
1. assume `x = ANY` for the input of all instructions in all blocks.
2. then, do a breadth-first traversal starting at bb0:
1. apply the block-level rules to determine the value of `x` at the start of that block.
• (the entry block bb0 is a special case.)
2. apply the instruction-level rules to determine the value of `x` at the end of that block.
• boom. let’s try it on the above code first.
• then, let’s try it on this:
``````  fn evil(): int {
let ret = 10;
for i in 0, 10 { print_s("ha"); }
return ret;
}
``````
• what’s wrong?
• the `if i < 10` block has two predecessors, but when we visit it, we don’t know what value `ret` is from one of them.
• worse, if we follow the predecessors backwards, we get back to this same block.
• I TOLD YOU ABOUT CYCLIC GRAPHS, BRO
• this is (fortunately) not too hard to solve.

## The improved algorithm

• our knowledge of `x` can now be one of three options:
• `x = ANY` and `x = C(..)` like before, and
• `x = UNVISITED` as an “initial” value.
• our instruction-level rules now look like this:
• for the instruction `x = C`, the output will be `x = C` regardless of input.
• for the instruction `x = anything else`, the output will be `x = ANY` regardless of input.
• for all other instructions, the output will be the input.
• and our block-level rules look like this:
1. if any predecessor says `x = ANY`, then we have to assume `x = ANY`.
2. if all predecessors say `x = UNVISITED`, then assume `x = UNVISITED`.
3. if all predecessors say `x = UNVISITED` or `x = C(..)`
• we can ignore the `UNVISITED` predecessors, and apply the constant rule as before (if the constants are all the same, the output will be `x = C(that constant)`; otherwise the output will be `x = ANY`).
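• the improved merge, with `UNVISITED` added (sentinels and the `("C", value)` encoding are assumptions, as before):

```python
# Three-valued merge: ANY, UNVISITED, or a known constant ("C", value).
ANY, UNVISITED = "ANY", "UNVISITED"

def merge(pred_facts):
    if any(f == ANY for f in pred_facts):
        return ANY                           # rule 1: any ANY forces ANY
    visited = [f for f in pred_facts if f != UNVISITED]
    if not visited:
        return UNVISITED                     # rule 2: all UNVISITED
    if len(set(visited)) == 1:
        return visited[0]                    # rule 3: same constant everywhere
    return ANY

# a loop back edge we haven't visited yet no longer ruins the constant:
print(merge([("C", 10), UNVISITED]))  # -> ('C', 10)
print(merge([("C", 10), ("C", 20)]))  # -> ANY
```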
• now when we run this algorithm on that loop code…
• we actually visit some blocks/instructions more than once!!
• yep! that’s fine.
• we can still prove this algorithm terminates.
• there are still a finite number of states our knowledge of `x` can be in.
• those states change monotonically (`UNVISITED -> const, const -> ANY, UNVISITED -> ANY`)
• so eventually, everything will “settle” on a final value.
``````  fn good(): int {
let ret = 10;
for i in 0, 10 { ret = 20; }
return ret;
}
``````
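• putting the pieces together, a worklist run of the whole analysis on `good()` might look like this (the CFG encoding, block → (instructions, successors), is invented for the sketch; instructions only track assignments to `ret`):

```python
ANY, UNVISITED = "ANY", "UNVISITED"

def transfer(fact, instrs):
    for dest, rhs in instrs:
        if dest == "ret":
            fact = ("C", rhs) if isinstance(rhs, int) else ANY
    return fact

def merge(facts):
    if any(f == ANY for f in facts):
        return ANY
    seen = {f for f in facts if f != UNVISITED}
    if not seen:
        return UNVISITED
    return seen.pop() if len(seen) == 1 else ANY

def analyze(cfg, entry="bb0"):
    preds = {bb: [] for bb in cfg}
    for bb, (_, succs) in cfg.items():
        for s in succs:
            preds[s].append(bb)
    out = {bb: UNVISITED for bb in cfg}      # knowledge at the end of each block
    work = [entry]
    while work:
        bb = work.pop(0)
        instrs, succs = cfg[bb]
        inf = ANY if bb == entry else merge([out[p] for p in preds[bb]])
        new = transfer(inf, instrs)
        if new != out[bb]:                   # only re-queue on a change,
            out[bb] = new                    # which is why this terminates
            work.extend(succs)
    return out

# fn good(): bb0 sets ret = 10; bb1 is the loop condition;
# bb2 is the body (ret = 20); bb3 is the return.
cfg = {
    "bb0": ([("ret", 10)], ["bb1"]),
    "bb1": ([], ["bb2", "bb3"]),
    "bb2": ([("ret", 20)], ["bb1"]),
    "bb3": ([], []),
}
result = analyze(cfg)
print(result["bb3"])  # -> ANY
```

• note that `bb1` gets visited twice: first with `C(10)` coming only from `bb0`, then again once the back edge from `bb2` carries `C(20)`, at which point it settles on `ANY`. so at the `return`, `ret` can't be copy-propagated, even though the loop body always runs; the analysis is deliberately conservative.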

• let’s look at a harder problem:
``````  let some_global = true;
fn foo(): int {
let ret = 0;
if some_global { ret = 10; }
return ret;
}
``````
• maybe `some_global` is “a constant” if nobody ever assigns into it… but how would I prove that?
• I’d have to look at the whole program
• which is a whole other layer of optimization on top of global opt