Chapter 5.3. Control-flow Analysis
Overview
Control-flow analysis is a critical component of static analysis, enabling developers to understand the flow of a program and identify potential issues before runtime. It tracks the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated.
Among the carious tools and techniques used in this analysis, call graphs stand out as powerful representations of function invocations within a program. A call graph is a control-flow graph, which represents calling relationships between subroutines in a computer program. Each node represents a procedure and each edge (f,g) indicates the caller and callee relationship between f and g.
In our program, we provide the feature to generate static call graphs and store the result in a graph data structure implemented with adjacent list.
Quick Start
You can use the feature with the following command:
cargo rapx -callgraph
Graph APIs
To utilize the analysis results as you want, you can use our module as follows:
#![allow(unused)] fn main() { use analysis::core::call_graph::Callgraph; // import module let callgraph = Callgraph::new(tcx); // create a callgraph object callgraph.start(); // do the analysis }
get_callee_def_path
#![allow(unused)] fn main() { pub fun get_callee_def_path(&self, def_path: String) -> Option<HashSet<String>>{ ... } }
You can get all callees define path returned in a hash set with the caller define path.
Generating Call Graphs
Working with MIR (Mid-level Intermediate Representation)
MIR is Rust's intermediate representation used during compilation. It simplifies control flow by breaking down functions into a series of basic blocks, making it easier to analyze.
Our program make good use of this handy tool to help generate call graphs. We analyze what specific kind the current basic block belongs to, which represents ways of existing from a basic block.
#![allow(unused)] fn main() { pub struct Body<'tcx> { pub basic_blocks: BasicBlocks<'tcx>, .. } }
As basic_blocks exist in the struct Body, we should get Body first. There are some functions to help us with this:
- optimized_mir
- mir_for_ctfe ("stfe" stands for "Compile Time Function Evaluation")
In the case of DefKind::Const and DefKind::static, mir_for_ctfe is necessary, since rust prevents us from applying optimization to const or static ones.
Inside the BasicBlockData we can get Terminator.
#![allow(unused)] fn main() { pub enum TerminatorKind<'tcx> { Call { func: Operand<'tcx>, .. }, .. } }
Our target is to get the Call in the enum TerminatorKind. Therefore we can use the resolve function to get define id of callee functions.
With define id (i.e. def_id), it is easy to apply it to def_path_str method to get define path and construct our call graphs.
Case Study: Dynamic Trait Analysis
It is nontrivial to analyze a program applying dynamic trait. Since we have to do our analysis statically, we guarantee the analysis sound and safe. In the case of dynamic trait analysis, we give out the "maybe" answer, e.g.:
|RAP|INFO|: 8:main -> 7:(dyn trait) <* as Animal>::make_sound