Stackify CFG (!453) · Merge requests · Why3 / why3

Xavier Denis requested to merge stackify into master Jan 29, 2021

This MR implements a "Stackify" based backend for MLCFG. This allows us to transform a CFG into a structured WhyML program, simplifying proof obligations along the way. It is meant to complement the existing backend, providing more convenient VCs when the CFG satisfies certain criteria.

This algorithm is strongly inspired by the Stackify algorithm in LLVM which is responsible for compiling LLVM CFGs into WASM (which has structured control flow). It relies on CFGs being reducible graphs. What this property tells us is that every loop in the graph has a single entry point. In turn, that means that a loop in the graph can be turned into a WhyML expression! For example, consider the following reducible graph:

1 --> 2 -> 3 
      ^    |
      +----+

We could translate this into 1; loop { 2, 3 }.

However, a CFG may also have branching and joining! potentially crossing loop boundaries! Reducibility guarantees a single entry point, it does not ensure a single exit. Consider the simple diamond graph:

1 --> 2 --> 4
|           ^
|           |
+---> 3 ----+

How can structure the control flow here? Intuitively it corresponds to the graph of an if-else expression, and we would like something that resembles it as output. Of course, we could always duplicate 4 into each branch, but that would quickly lead to exponential blowup with fairly simple programs. Instead the idea behind Stackify is to introduce scopes around indirect branching. A scope is a structure you can break out of, (basically a loop that only executes once). The previous example could be translated by Stackify as scope D {scope C { 1; if (goto_3) { break C } ; 2 }; 3 }; 4. With a clever implementation we can often recover the original control flow, in the previous example it is easy to see how scope C could be inlined into the branch and the 2 wrapped in the else.

The actual algorithm works in three main passes:

Sort the graph nodes in topological order with one restriction: when we output a loop header node, we must output only nodes it dominates until the full loop has been output. This restriction ensures that we can easily group together loop bodies.
Insert loop and scope start / end markers

Loop start markers are inserted just before a loop header node
Loop end markers are inserted just after the last loop body node
Scope start markers are inserted if the node is the nearest common dominator of the predecessors of a node reached by indirect jumping (not fallthrough).
Scope end markers are inserted before a node reached by jumping.

Build the expression tree and translated it into Mlw constructs.

The current implementation translates loops into recursive functions, indirect jumps into function calls, scopes are turned into let-bindings,

Note: this branch depends on !447 (merged).

TODO

Write extensive test suite
Update documentation
Add a check to verify a graph is reducible before running
Add back support for invariants
Explore using while, break and continue rather than functions and let bindings. This requires labeled break / continue.
Clean up topological sorting function.
Remove un-needed graph functions?
Fix code duplication when a switch has it's fallthrough as multiple targets
Make this an alternative translation rather than a replacement

Edited Jan 13, 2022 by Xavier Denis

Stackify CFG

Merge request reports