Stackify CFG
This MR implements a "Stackify" based backend for MLCFG. This allows us to transform a CFG into a structured WhyML program, simplifying proof obligations along the way. It is meant to complement the existing backend, providing more convenient VCs when the CFG satisfies certain criteria.
This algorithm is strongly inspired by the Stackify
algorithm in LLVM which is responsible for compiling LLVM CFGs into WASM (which has structured control flow). It relies on CFGs being reducible graphs. What this property tells us is that every loop in the graph has a single entry point. In turn, that means that a loop in the graph can be turned into a WhyML expression!
For example, consider the following reducible graph:
1 --> 2 -> 3
^ |
+----+
We could translate this into 1; loop { 2, 3 }
.
However, a CFG may also have branching and joining! potentially crossing loop boundaries! Reducibility guarantees a single entry point, it does not ensure a single exit. Consider the simple diamond graph:
1 --> 2 --> 4
| ^
| |
+---> 3 ----+
How can structure the control flow here? Intuitively it corresponds to the graph of an if-else
expression, and we would like something that resembles it as output.
Of course, we could always duplicate 4
into each branch, but that would quickly lead to exponential blowup with fairly simple programs.
Instead the idea behind Stackify is to introduce scopes around indirect branching. A scope is a structure you can break out of, (basically a loop that only executes once).
The previous example could be translated by Stackify as scope D {scope C { 1; if (goto_3) { break C } ; 2 }; 3 }; 4
.
With a clever implementation we can often recover the original control flow, in the previous example it is easy to see how scope C
could be inlined into the branch and the 2
wrapped in the else.
The actual algorithm works in three main passes:
- Sort the graph nodes in topological order with one restriction: when we output a loop header node, we must output only nodes it dominates until the full loop has been output. This restriction ensures that we can easily group together loop bodies.
- Insert loop and scope start / end markers
- Loop start markers are inserted just before a loop header node
- Loop end markers are inserted just after the last loop body node
- Scope start markers are inserted if the node is the nearest common dominator of the predecessors of a node reached by indirect jumping (not fallthrough).
- Scope end markers are inserted before a node reached by jumping.
- Build the expression tree and translated it into Mlw constructs.
The current implementation translates loops into recursive functions, indirect jumps into function calls, scopes are turned into let-bindings,
Note: this branch depends on !447 (merged).
TODO
-
Write extensive test suite -
Update documentation -
Add a check to verify a graph is reducible before running -
Add back support for invariants -
Explore using while
,break
andcontinue
rather than functions and let bindings. This requires labeledbreak
/continue
. -
Clean up topological sorting function. -
Remove un-needed graph functions? -
Fix code duplication when a switch
has it's fallthrough as multiple targets -
Make this an alternative translation rather than a replacement