Chapter 5.6. Range Analysis in Rust

Range analysis is a type of static analysis used to track the range (interval) of possible values that a variable can take during the execution of a program. By maintaining an upper and lower bound for each variable, the analysis helps optimize code by narrowing the values that a program may handle. In Rust, range analysis is particularly important in verifying safety properties, such as ensuring that an array access is always within bounds or that certain calculations do not result in overflows.

Range Analysis Trait

#![allow(unused)]
fn main() {
pub trait RangeAnalysis<'tcx, T: IntervalArithmetic + ConstConvert + Debug>: Analysis {
    fn get_fn_range(&self, def_id: DefId) -> Option<RAResult<'tcx, T>>;
    fn get_fn_ranges_percall(
        &self,
        def_id: DefId,
    ) -> Option<Vec<RAResult<'tcx, T>>>;
    fn get_all_fn_ranges(&self) -> RAResultMap<'tcx, T>;
    fn get_all_fn_ranges_percall(&self) -> RAVecResultMap<'tcx, T>;
    fn get_fn_local_range(
        &self,
        def_id: DefId,
        local: Place<'tcx>,
    ) -> Option<Range<T>>;
    fn get_fn_path_constraints(
        &self,
        def_id: DefId,
    ) -> Option<PathConstraint<'tcx>>;
    fn get_all_path_constraints(&self) -> PathConstraintMap<'tcx>;
}
}

Quick Usage Guide

To test the feature via terminal command:

cargo rapx -range

or show the entire analysis process via terminal command:

RAP_LOG=trace cargo rapx -range

To use the feature in Rust code:

#![allow(unused)]
fn main() {
let mut range_analysis = RangeAnalyzer::<i128>::new(self.tcx, false);
range_analysis.run();
let path_constraint = range_analysis.get_all_path_constraints();
rap_info!("{}", PathConstraintMapWrapper(path_constraint));
let result = analyzer.get_all_fn_ranges();
}

Supported Integer Types For Ranges:

i32, i64, i128
u32, u64, u128
usize

Default Implementation

RAPx provides a default implementation of RangeAnalysis trait in range_analysis.rs. The implementation is inspired by the following CGO paper.

Raphael Ernani Rodrigues, Victor Hugo Sperle Campos, and Fernando Magno Quintao Pereira. "A fast and low-overhead technique to secure programs against integer overflows." In Proceedings of the 2013 IEEE/ACM international symposium on code generation and optimization (CGO), pp. 1-11. IEEE, 2013.

#![allow(unused)]
fn main() {
pub struct RangeAnalyzer<'tcx, T: IntervalArithmetic + ConstConvert + Debug> {
    pub tcx: TyCtxt<'tcx>,
    pub debug: bool,
    pub ssa_def_id: Option<DefId>,
    pub essa_def_id: Option<DefId>,
    pub final_vars: FxHashMap<DefId, HashMap<Local, Range<T>>>,
    pub ssa_locals_mapping: FxHashMap<DefId, HashMap<Local, HashSet<Local>>>,
}
}

Why SSA Form?

Before performing range analysis, the MIR (Mid-level Intermediate Representation) is first transformed into Static Single Assignment (SSA) form. SSA guarantees that each variable is assigned exactly once, and every use of a variable refers to a unique definition. This transformation simplifies the analysis in several ways:

It makes data flow explicit, allowing the analysis to accurately track how values propagate through the program.
It allows precise modeling of control flow joins using phi-like constructs.
It improves precision by allowing the interval of each version of a variable to be analyzed separately after each assignment.

Don’t worry about losing track of variables after SSA transformation: The DefaultRange maintains a mapping ssa_locals_mapping, which is a HashMap from the original MIR Local variables to their corresponding SSA Locals. This ensures that even after SSA conversion, you can still query the intervals for the variables you care about using their original MIR identity. The SSA transformation is essential for sound and precise interval analysis and is a foundational preprocessing step in this system.

Range Analysis Features

Flow Sensitivity

Interval analysis in Rust can be flow-sensitive, meaning that it accounts for the different execution paths a program might take. This allows the analysis to track how intervals change as variables are assigned or modified during the execution flow, improving the precision of analysis.

Lattice-based Approach

In this approach, values of variables are represented in a lattice, where each element represents an interval. A lattice ensures that each combination of intervals has a defined result, and merging different paths of a program is done by taking the least upper bound (LUB) of intervals from each path.

For example, if a variable x can have an interval [0, 10] on one path and [5, 15] on another path, the merged interval would be [0, 15] because that represents the union of both possible value ranges.

Meet-over-all-paths (MOP) Approach

In the meet-over-all-paths (MOP) approach, the analysis is performed by considering every possible path through a program and merging the results into a final interval. This approach is path-sensitive but may be less scalable on large programs because it needs to account for all paths explicitly.

Precise Interprocedural Analysis

Although each callee function is analyzed once globally for performance,Every call site (i.e., function invocation) triggers a separate numeric evaluation using the actual arguments passed in.

This hybrid approach preserves analysis precision without sacrificing performance.

Case Study: Array Bound Checking

Consider the case where interval analysis can be used to check array bounds. Rust guarantees memory safety through strict bounds checking, but interval analysis can help eliminate redundant checks or prevent unsafe accesses during compile-time.

#![allow(unused)]
fn main() {
fn get_element(arr: &[i32], index: usize) -> Option<i32> {
    if index < arr.len() {
        Some(arr[index])
    } else {
        None
    }
}

}

Using interval analysis, the compiler can infer the interval for index and ensure that it remains within bounds ([0, arr.len())). If an interval of index is known to always be within this range, the compiler could potentially eliminate the bounds check, optimizing the code.

Rust Analysis Platform