Chapter 5.1. Alias Analysis
Alias analysis involves determining if two identifiers point to the same memory allocation. The task is challenging, with various options that balance precision and cost, including flow sensitivity, field sensitivity, and path sensitivity. In general, there are two main approaches to analysis: the lattice-based approach and the meet-over-all-paths (MOP) approach.
Alias Analysis Trait
RAPx provides the AliasAnalysis<T>
trait for alias analysis. The trait has two methods, which enables users to query the aliases among the arguments and return value of a function based on the function DefId
, or the aliases of all functions as a FxHashMap
. Developers can implement the trait and specify the format of the aliases based on their needs.
#![allow(unused)] fn main() { pub trait AliasAnalysis<T>: Analysis { fn get_fn_alias(&mut self, def_id: DefId) -> T; fn get_all_fn_alias(&mut self) -> FxHashMap<DefId, T>; } }
For the type parameter T
, we have a default type AAResult
, which contains a HashSet
of multiple alias relationships, and each alias relationship is recorded as AAFact
.
#![allow(unused)] fn main() { pub struct AAResult { arg_size: usize, alias_set: HashSet<AAFact>, } pub struct AAFact { pub lhs_no: usize, // parameter id; the id of return value is `0`; pub lhs_fields: Vec<usize>, // field-sensive: sequence of (sub) field numbers for `lhs_no` pub rhs_no: usize, // parameter id, which is an alias of `lhs_no`. pub rhs_fields: Vec<usize>, // field-sensive: sequence of (sub) field numbers for `rhs_no` } }
RAPx has implemented a default alias analysis algorithm based on MOP.
Default Alias Analysis
Implementation
The MOP-based alias approach is achieved via a struct MopAlias
, which implements the AliasAnalysis
trait. The detailed implementation can be found in mop.rs.
#![allow(unused)] fn main() { pub struct DefaultAlias<'tcx> { pub tcx: TyCtxt<'tcx>, pub fn_map: FxHashMap<DefId, AAResult>, } }
The results can be retrieved by decoding the data structure of AAResult
.
Features and Examples
The approach performs alias analysis for each execution path of a target function and merges the results from different paths into a final result. When encountering function calls, it recursively analyzes the callees until all dependencies are resolved. This approach is flow-sensitive and field-sensitive but context-insensitive.
Case 1
In the following code, there are four possible paths depending on the enumeration value choice
. However, only two of these paths are reachable. As a result, the return value of foo()
is an alias of the first argument x
; it cannot be an alias of the second argument y
. Such an alias analysis result can be achieved using the MOP-based approach, but not the lattice-based approach.
#![allow(unused)] fn main() { enum Selector { First, Second, } fn foo<'a>(x: &'a i32, y: &'a i32, choice: Selector) -> &'a i32 { let a = match choice { Selector::First => x, Selector::Second => y, }; match choice { Selector::First => a, Selector::Second => x, } } }
The corresponding MIR code is as follows:
#![allow(unused)] fn main() { fn foo(_1: &i32, _2: &i32, _3: Selector) -> &i32 { bb0: { StorageLive(_4); _5 = discriminant(_3); switchInt(move _5) -> [0: bb3, 1: bb1, otherwise: bb2]; } bb1: { _4 = _2; goto -> bb4; } bb2: { unreachable; } bb3: { _4 = _1; goto -> bb4; } bb4: { _6 = discriminant(_3); switchInt(move _6) -> [0: bb6, 1: bb5, otherwise: bb2]; } bb5: { _0 = _1; goto -> bb7; } bb6: { _0 = _4; goto -> bb7; } bb7: { StorageDead(_4); return; } } }
Since 0 and 1 are the identifiers for the return value and the first argument x
correspondingly, the expected result of alias analysis is (0, 1).
Case 2: field-sensitivity
In the following example, the return value of foo()
is an alias of the first field of its first argument.
#![allow(unused)] fn main() { struct Point { x: i32, y: i32, } fn foo(p1: &Point) -> &i32 { &p1.y } }
The corresponding MIR code is as follows:
#![allow(unused)] fn main() { fn foo(_1: &Point) -> &i32 { bb0: { _0 = &((*_1).1: i32); return; } } }
The alias analysis result should be (0, 1.1).
Quick Usage Guide
Developers can test the feature using the following command:
cargo rapx -alias
For example, we can apply the mop analysis to the first case, and the result is as follows:
Checking alias_mop_field...
21:50:18|RAP|INFO|: Start analysis with RAP.
21:50:18|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}
21:50:18|RAP|INFO|: Alias found in Some("::foo"): {(0,1.1),(0,1.0)}
When applying the mop analysis to the first case, and the result is as follows:
Checking alias_mop_switch...
21:53:37|RAP|INFO|: Start analysis with RAP.
21:53:37|RAP|INFO|: Alias found in Some("::foo"): {(0,2),(0,1)}
21:53:37|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}
To utilize the analysis results in other analytical features, developers can use mop as follows:
#![allow(unused)] fn main() { use analysis::core::alias::mop::MopAlias; // import the module MopAlias::new(tcx).start(); // new a MopAlias object with . }
The code above performs alias analysis for each function, recording the alias pairs between two arguments or between an argument and the return value.
Key Steps of Our Algorithm
There are three key steps (source code):
#![allow(unused)] fn main() { let mut mop_graph = MopGraph::new(self.tcx, def_id); mop_graph.solve_scc(); mop_graph.check(0, &mut self.fn_map); }
- Graph preparation: Construct the control-flow graph for the target function. See the source code.
- SCC shrinkage: Extract the strongly connected components (SCCs) and shrink SCCs of the control-flow graph. See the source code.
- Alias Check: Traversal the control-flow graph and perform alias analysis. See the source code
Reference
The feature is based on our SafeDrop paper, which was published in TOSEM.
@article{cui2023safedrop,
title={SafeDrop: Detecting memory deallocation bugs of rust programs via static data-flow analysis},
author={Mohan Cui, Chengjun Chen, Hui Xu, and Yangfan Zhou},
journal={ACM Transactions on Software Engineering and Methodology},
volume={32},
number={4},
pages={1--21},
year={2023},
publisher={ACM New York, NY, USA}
}