Chapter 1. Introduction

Writing program analysis tools is challenging. In this project, we aim to provide a user-friendly framework for developing new static program analysis features for Rust. In particular, our project integrates dozens of classic program analysis algorithms, including those for pointer analysis, value-flow analysis, control-flow analysis, and more. Developers can choose and combine them like Lego blocks to achieve new bug detection or verification capabilities.

The following figure demonstrates the framework of RAPx, which is composed of a core layer and an application layer. The project repository of RAPx can be found at https://github.com/Artisan-Lab/RAPx. Note that the project is still under heavy development, and some modules are not available yet.

Framework of RAPx.

Chapter 2. Installation and Usage Guide

Platform Support

  • Linux
  • macOS (both x86_64 and aarch64 versions)

Preparation

RAPx is based on Rust version nightly-2024-10-12. You can install this version using the following command.

rustup toolchain install nightly-2024-10-12 --profile minimal --component rustc-dev,rust-src,llvm-tools-preview

If you have multiple Rust versions, please ensure the default version is set to nightly-2024-10-12.

rustup show
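
If the default toolchain differs, you can switch it with:

rustup default nightly-2024-10-12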

Install

Download the project

git clone https://github.com/Artisan-Lab/RAPx.git

Build and install RAPx

./install.sh

You can combine the previous two steps into a single command:

cargo +nightly-2024-10-12 install rapx --git https://github.com/Artisan-Lab/RAPx.git

macOS users may encounter compilation errors related to the Z3 headers and libraries. There are two solutions:

The first one is to manually export the headers and libraries as follows:

export C_INCLUDE_PATH=/opt/homebrew/Cellar/z3/VERSION/include:$C_INCLUDE_PATH
ln -s /opt/homebrew/Cellar/z3/VERSION/lib/libz3.dylib /usr/local/lib/libz3.dylib

Alternatively, you can modify the Cargo.toml file to change the dependency of Z3 to use static linkage. However, this may significantly slow down the installation process, so we do not recommend enabling this option by default.

[dependencies]
z3 = {version="0.12.1", features = ["static-link-z3"]}

After this step, you should be able to see rapx listed as a cargo subcommand:

cargo --list

Execute the following command to run RAPx and print the help message:

cargo rapx -help
00:00:00|RAP|INFO|: 
Usage:
    cargo rapx [rapx options] -- [cargo check options]

RAPx Options:

Use-After-Free/double free detection.
    -F or -uaf       command: "cargo rapx -uaf"

Memory leakage detection.
    -M or -mleak     command: "cargo rapx -mleak"

Debugging options:
    -mir             print the MIR of each function

General command: 
    -H or -help:     show help information
    -V or -version:  show the version of RAPx
...

Uninstall

cargo uninstall rapx

Usage

To analyze system software without std (e.g., Asterinas), try the following command:

cargo rapx -F -- --target x86_64-unknown-none

To analyze the Rust standard library, try the following command:

cargo rapx -stdsp -- -Z build-std --target x86_64-unknown-linux-gnu

Chapter 3. Framework of RAPx

Traditionally, performing code analysis requires modifying the compiler source code to add new passes. Developers then need to recompile the compiler to activate these new passes, which can be cumbersome. The Rust compiler offers a more portable way to conduct code analysis using the rustc_driver. We refer to this approach as the frontend method because it allows developers to directly access internal compiler data and perform analysis as callbacks.

Custom Cargo Commands

To support project-level program analysis, we want the analysis tool to be integrated into cargo as subcommands. To this end, we can name the tool as cargo-toolname and place it in $CARGO_HOME/bin or $PATH. Then we can execute the tool via the following command.

cargo toolname -more_arguments

Cargo automatically searches those paths for a binary named cargo-toolname. The following figure demonstrates the whole process before reaching our analysis program (workflow of how cargo dispatches the analysis command to rapx). Note that we cannot invoke rapx directly in the first round; we go through cargo check because we need cargo to manage the project-level compilation and append the detailed compilation options for launching rustc. However, we want to hook the rustc execution and run rapx for analysis instead. Therefore, we set RUSTC_WRAPPER to cargo-rapx. In this way, cargo check actually runs cargo-rapx rustc appended_options, and we then dispatch the execution to rapx with the appended options.
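
The following is a minimal sketch (not the actual cargo-rapx implementation) of how such a wrapper binary can dispatch between the two rounds; run_analysis_driver is a hypothetical placeholder:

use std::process::Command;

fn main() {
    let args: Vec<String> = std::env::args().collect();
    if args.get(1).map(|s| s.ends_with("rustc")).unwrap_or(false) {
        // Second round: cargo check invokes us as the RUSTC_WRAPPER,
        // i.e., `cargo-toolname rustc <appended_options>`.
        // Hand the appended options over to the analysis driver.
        run_analysis_driver(&args[2..]);
    } else {
        // First round: invoked as `cargo toolname ...`.
        // Re-run `cargo check` with ourselves registered as the rustc wrapper.
        Command::new("cargo")
            .arg("check")
            .env("RUSTC_WRAPPER", std::env::current_exe().unwrap())
            .status()
            .expect("failed to run cargo check");
    }
}

fn run_analysis_driver(_rustc_args: &[String]) {
    // Here the real tool would invoke rustc_driver with its analysis callbacks.
}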

Register Analysis Callbacks

Suppose the goal is to execute a function named my_analysis. Developers should define a new struct and implement the Callbacks trait for it.

#![allow(unused)]
fn main() {
pub struct MyCallback {...}
impl Callbacks for MyCallback {
    fn after_analysis<'tcx>(&mut self, compiler: &Compiler, queries: &'tcx Queries<'tcx>) -> Compilation {
        compiler.session().abort_if_errors();
        queries.global_ctxt().unwrap().enter(
            |tcx| my_analysis(tcx, *self) // the analysis function to execute after compilation.
        );
        compiler.session().abort_if_errors();
        Compilation::Continue
    }
}
}

To execute the compiler and the callback function, developers can employ the rustc_driver::RunCompiler API provided by the compiler.

#![allow(unused)]

fn main() {
let mut callback = MyCallback::default();
let run_compiler = rustc_driver::RunCompiler::new(&args, &mut callback);
let exit_code = rustc_driver::catch_with_exit_code(move || run_compiler.run());
}

Chapter 4. Preliminary: Compiler Internals

Compiler Internals

We list the essential official documents to read for a better understanding of the internal data structures of the Rust compiler.

  • HIR
  • MIR
  • TyCtxt is the central data structure of the Rust compiler. We can obtain the HIR or MIR of a function through this object.
#![allow(unused)]
fn main() {
let hir = tcx.hir();
let mir = tcx.optimized_mir(def_id); // def_id is of type DefId
}
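
As a small, hedged illustration (not RAPx code; exact APIs may vary across nightly versions), the following sketch iterates over all local function bodies and fetches their optimized MIR via TyCtxt:

use rustc_hir::def::DefKind;
use rustc_middle::ty::TyCtxt;

fn print_all_mir(tcx: TyCtxt<'_>) {
    for def_id in tcx.hir().body_owners() {
        // optimized_mir only applies to functions; consts/statics need mir_for_ctfe.
        if matches!(tcx.def_kind(def_id), DefKind::Fn | DefKind::AssocFn) {
            let body = tcx.optimized_mir(def_id);
            println!("{:?}: {} basic blocks", def_id, body.basic_blocks.len());
        }
    }
}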

Command to Display HIR/MIR

Execute the following commands to obtain the HIR or MIR of the source code.

cargo rustc -- -Z unpretty=hir-tree
cargo rustc -- -Zunpretty=mir

Chapter 5. Core Modules

This chapter introduces the core modules of RAPx, which implement several commonly used program analysis features, including alias analysis, dataflow analysis, control-flow analysis, and more. Please refer to the corresponding sub-chapters for more information.

Chapter 5.1. Alias Analysis

Alias analysis involves determining if two identifiers point to the same memory allocation. The task is challenging, with various options that balance precision and cost, including flow sensitivity, field sensitivity, and path sensitivity. In general, there are two main approaches to analysis: the lattice-based approach and the meet-over-all-paths (MOP) approach.

MOP-based Alias Analysis

Features and Examples

The approach performs alias analysis for each execution path of a target function and merges the results from different paths into a final result. When encountering function calls, it recursively analyzes the callees until all dependencies are resolved. This approach is flow-sensitive and field-sensitive but context-insensitive.

Case 1

In the following code, there are four possible paths depending on the enumeration value choice. However, only two of these paths are reachable. As a result, the return value of foo() is an alias of the first argument x; it cannot be an alias of the second argument y. Such an alias analysis result can be achieved using the MOP-based approach, but not the lattice-based approach.

#![allow(unused)]
fn main() {
enum Selector {
    First,
    Second,
}

fn foo<'a>(x: &'a i32, y: &'a i32, choice: Selector) -> &'a i32 {
    let a = match choice {
        Selector::First => x, 
        Selector::Second => y,
    };
    match choice {
        Selector::First => a, 
        Selector::Second => x,
    }
}
}

The corresponding MIR code is as follows:

#![allow(unused)]
fn main() {
fn foo(_1: &i32, _2: &i32, _3: Selector) -> &i32 {
    bb0: {
        StorageLive(_4);
        _5 = discriminant(_3);
        switchInt(move _5) -> [0: bb3, 1: bb1, otherwise: bb2];
    }

    bb1: {
        _4 = _2;
        goto -> bb4;
    }

    bb2: {
        unreachable;
    }

    bb3: {
        _4 = _1;
        goto -> bb4;
    }

    bb4: {
        _6 = discriminant(_3);
        switchInt(move _6) -> [0: bb6, 1: bb5, otherwise: bb2];
    }

    bb5: {
        _0 = _1;
        goto -> bb7;
    }

    bb6: {
        _0 = _4;
        goto -> bb7;
    }

    bb7: {
        StorageDead(_4);
        return;
    }
}
}

Since 0 and 1 are the identifiers for the return value and the first argument x respectively, the expected result of the alias analysis is (0, 1).

Case 2: field-sensitivity

In the following example, the return value of foo() is an alias of the second field (y) of its first argument.

#![allow(unused)]
fn main() {
struct Point {
    x: i32,
    y: i32,
}

fn foo(p1: &Point) -> &i32 {
    &p1.y
}
}

The corresponding MIR code is as follows:

#![allow(unused)]
fn main() {
fn foo(_1: &Point) -> &i32 {
   bb0: {
        _0 = &((*_1).1: i32);
        return;
    }
}
}

The alias analysis result should be (0, 1.1).

Quick Usage Guide

Developers can test the feature using the following command:

cargo rapx -alias=mop

For example, applying the MOP analysis to the second case (field sensitivity) produces the following result:

Checking alias_mop_field...
21:50:18|RAP|INFO|: Start analysis with RAP.
21:50:18|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}
21:50:18|RAP|INFO|: Alias found in Some("::foo"): {(0,1.1),(0,1.0)}

Applying the MOP analysis to the first case produces the following result:

Checking alias_mop_switch...
21:53:37|RAP|INFO|: Start analysis with RAP.
21:53:37|RAP|INFO|: Alias found in Some("::foo"): {(0,2),(0,1)}
21:53:37|RAP|INFO|: Alias found in Some("::boxed::{impl#0}::new"): {(0.0,1)}

To utilize the analysis results in other analytical features, developers can use mop as follows:

#![allow(unused)]
fn main() {
use analysis::core::alias::mop::MopAlias; // import the module
MopAlias::new(tcx).start(); // create a MopAlias object and run the analysis
}

The code above performs alias analysis for each function, recording the alias pairs between two arguments or between an argument and the return value. The results can be retrieved by decoding the following data structures.

#![allow(unused)]
fn main() {
pub type FnMap = FxHashMap<DefId, FnRetAlias>;
pub struct MopAlias<'tcx> {
    pub tcx: TyCtxt<'tcx>,
    pub fn_map: FnMap,
}

pub struct FnRetAlias {
    pub arg_size: usize,
    pub alias_vec: Vec<RetAlias>,
}

pub struct RetAlias{
    pub left_index: usize,
    pub left_field_seq: Vec<usize>, 
    pub left_may_drop: bool, 
    pub left_need_drop: bool,
    pub right_index: usize,
    pub right_field_seq: Vec<usize>,
    pub right_may_drop: bool, 
    pub right_need_drop: bool,
}
}
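
As a hedged sketch of how these results might be consumed (the function dump_alias_results below is hypothetical; the fields it uses are exactly those defined above):

fn dump_alias_results(alias: &MopAlias<'_>) {
    for (def_id, ret_alias) in &alias.fn_map {
        for pair in &ret_alias.alias_vec {
            // Index 0 denotes the return value; other indices denote arguments.
            // The field sequences refine the pair, e.g., (0, 1.1) means the
            // return value aliases field 1 of the first argument.
            println!(
                "{:?}: ({}{:?}, {}{:?})",
                def_id,
                pair.left_index, pair.left_field_seq,
                pair.right_index, pair.right_field_seq
            );
        }
    }
}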

Key Steps of Our MOP Algorithm

There are three key steps (source code):

#![allow(unused)]
fn main() {
let mut mop_graph = MopGraph::new(self.tcx, def_id);
mop_graph.solve_scc();
mop_graph.check(0, &mut self.fn_map);
}
  • Graph preparation: Construct the control-flow graph for the target function. See the source code.
  • SCC shrinkage: Extract the strongly connected components (SCCs) and shrink SCCs of the control-flow graph. See the source code.
  • Alias check: Traverse the control-flow graph and perform alias analysis. See the source code.

Reference

The feature is based on our SafeDrop paper, which was published in TOSEM.

@article{cui2023safedrop,
  title={SafeDrop: Detecting memory deallocation bugs of rust programs via static data-flow analysis},
  author={Cui, Mohan and Chen, Chengjun and Xu, Hui and Zhou, Yangfan},
  journal={ACM Transactions on Software Engineering and Methodology},
  volume={32},
  number={4},
  pages={1--21},
  year={2023},
  publisher={ACM New York, NY, USA}
}

Chapter 5.2. API-Dependency Graph

The feature is based on our RuMono paper, which was published in TOSEM.

@article{zhangrumono,
  title={RuMono: Fuzz Driver Synthesis for Rust Generic APIs},
  author={Zhang, Yehong and Wu, Jun and Xu, Hui},
  journal={ACM Transactions on Software Engineering and Methodology},
  publisher={ACM New York, NY}
}

Chapter 5.3. Control-flow Analysis

Overview

Control-flow analysis is a critical component of static analysis, enabling developers to understand the flow of a program and identify potential issues before runtime. It tracks the order in which individual statements, instructions or function calls of an imperative program are executed or evaluated.

Among the various tools and techniques used in this analysis, call graphs stand out as powerful representations of function invocations within a program. A call graph is a control-flow graph that represents the calling relationships between subroutines in a computer program: each node represents a procedure, and each edge (f, g) indicates that procedure f calls procedure g.

In our program, we provide the feature to generate static call graphs and store the result in a graph data structure implemented with adjacency lists.

Quick Start

You can use the feature with the following command:

cargo rapx -callgraph

Graph APIs

To utilize the analysis results as you want, you can use our module as follows:

#![allow(unused)]
fn main() {
use analysis::core::call_graph::Callgraph;  // import module
let callgraph = Callgraph::new(tcx);    // create a callgraph object
callgraph.start();  // do the analysis
}

get_callee_def_path

#![allow(unused)]
fn main() {
pub fn get_callee_def_path(&self, def_path: String) -> Option<HashSet<String>>{
    ...
}
}

Given the def path of a caller, this function returns the def paths of all its callees as a hash set.
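
A hedged usage sketch (the def path string below is only an example):

if let Some(callees) = callgraph.get_callee_def_path("my_crate::main".to_string()) {
    for callee in callees {
        println!("my_crate::main calls {}", callee);
    }
}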

Generating Call Graphs

Working with MIR (Mid-level Intermediate Representation)

MIR is Rust's intermediate representation used during compilation. It simplifies control flow by breaking down functions into a series of basic blocks, making it easier to analyze.

Our program makes good use of this handy representation to generate call graphs. We analyze the kind of terminator of each basic block, which represents the way of exiting from that block.

#![allow(unused)]
fn main() {
pub struct Body<'tcx> {
    pub basic_blocks: BasicBlocks<'tcx>,
    ..
}

}

As basic_blocks exists in the struct Body, we need to obtain the Body first. There are some functions to help us with this:

  1. optimized_mir
  2. mir_for_ctfe ("ctfe" stands for "Compile Time Function Evaluation")

In the case of DefKind::Const and DefKind::Static, mir_for_ctfe is necessary, since Rust prevents us from applying optimizations to const or static items.

Inside the BasicBlockData we can get Terminator.

#![allow(unused)]
fn main() {
pub enum TerminatorKind<'tcx> {
    Call {
        func: Operand<'tcx>,
        ..
    },
    ..
}
}

Our target is the Call variant of the enum TerminatorKind. From its func operand, we can use the resolve function to get the def id of the callee function.

With the def id (i.e., def_id), it is easy to call the def_path_str method to get the def path and construct our call graphs, as sketched below.
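
The following is a simplified, hedged sketch of this step (not the actual RAPx implementation; it reads the callee directly from the operand type instead of going through Instance::resolve, and exact compiler APIs may differ across nightly versions):

use rustc_middle::mir::{Body, TerminatorKind};
use rustc_middle::ty::{TyCtxt, TyKind};

fn collect_callees<'tcx>(tcx: TyCtxt<'tcx>, body: &Body<'tcx>) -> Vec<String> {
    let mut callees = Vec::new();
    for block in body.basic_blocks.iter() {
        if let Some(terminator) = &block.terminator {
            if let TerminatorKind::Call { func, .. } = &terminator.kind {
                // For a direct call, the type of the callee operand is FnDef(def_id, ..).
                if let TyKind::FnDef(def_id, _) = func.ty(body, tcx).kind() {
                    callees.push(tcx.def_path_str(*def_id));
                }
            }
        }
    }
    callees
}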

Case Study: Dynamic Trait Analysis

It is nontrivial to analyze a program that uses dynamic traits (trait objects). Since our analysis is static, we keep it sound by giving a conservative "maybe" answer for dynamic dispatch, e.g.:

|RAP|INFO|: 8:main -> 7:(dyn trait) <* as Animal>::make_sound

Chapter 5.4. Heap Item Analysis

Here we first give several definitions used throughout this section. These definitions can also be found in the rCanary paper.

  • Heap Item: A heap item is an object containing the heap allocation with a fixed type (layout).
  • Heap-item Unit: A heap-item unit has a PhantomData<T> field with a typed generic parameter and a pointer field to store the value T.
  • Isolated Parameter: An isolated parameter is a stand-alone and typed generic parameter (e.g., T but not &T).

In the type analysis, we need to generate an abstraction for all Rust types used in the local crate and its dependencies. Each type has a layout summary that illustrates its data abstraction in the fully initialized state.

The summary comprises two parts:

  1. a boolean value indicating if it contains a heap item;
  2. a boolean vector indicating if each generic parameter is an isolated parameter.

For example, the summary of the vector Vec<T,A> is (1) [0, 1].

Example of type analysis.

Overview

We give the overall framework of type analysis as below.

Framework of type analysis.

Then we will discuss the workflow of each component in this figure.

The visitor method is the main pass of the constructor part in type analysis. It performs several important procedures to determine whether an ADT definition (adt-def) will occupy at least one heap allocation, i.e., whether it holds heap ownership in the OBRM system.

For example, given an AdtDef (like Vec<T>), the result of visitor contains two parts:

  • pt1 Enum: {Owned / UnOwned} indicates whether the type directly holds heap data
  • pt2 Array: [bool; N] indicates whether each generic parameter is an isolated (raw) parameter

Those parts can accelerate heap-ownership inference in the data-flow analysis.
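
For illustration only (these type names are not the actual RAPx definitions), the per-type summary can be thought of as:

// pt1: whether the type directly holds heap data.
enum HeapOwnership {
    Owned,
    Unowned,
}

// The summary of one AdtDef: pt1 plus one flag per generic parameter (pt2).
struct TypeSummary {
    ownership: HeapOwnership,
    isolated_params: Vec<bool>, // e.g., Vec<T, A> => Owned, [false, true]
}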

From a top-down view, the visitor has several sub-phases; it contains multiple sub-visitors that together make the whole method self.visitor() work.

Type Collector

We use an inter-procedural visitor to collect all the types used in the local crate; the collected types are post-monomorphization. Our goal is to extract all the AdtDefs.

The struct core::heap_item::TypeAnalysis implements mir::Visitor.

#![allow(unused)]
fn main() {
pub struct TypeAnalysis<'tcx, 'a> {
    rcx: &'a mut rCanary<'tcx>,
    fn_set: Unique,
    ty_map: TyMap<'tcx>,
    adt_recorder: Unique,
}
}

Note: the type in this phase is Ty::ty rather than Hir::ty.

As we can see, the function visit_ty flattens all the fields and generics that appear. It covers almost all Rust types.

#![allow(unused)]
fn main() {
impl<'tcx, 'a> TypeAnalysis<'tcx, 'a> {
    fn visit_ty(&mut self, ty: Ty<'tcx>, ty_context: TyContext) {
        match ty.kind() {
            TyKind::Adt(adtdef, substs) => {
                // ...
                for field in adtdef.all_fields() { self.visit_ty(field.ty(self.tcx(), substs) ,copy_ty_context(&ty_context)) }
                for ty in substs.types() { self.visit_ty(ty, copy_ty_context(&ty_context)); } },
            TyKind::Array(ty, ..) => { self.visit_ty(*ty, ty_context); },
            TyKind::Slice(ty) => { self.visit_ty(*ty, ty_context); },
            TyKind::RawPtr(typeandmut) => { let ty = typeandmut.ty; self.visit_ty(ty, ty_context); },
            TyKind::Ref(_, ty, ..) => { self.visit_ty(*ty, ty_context); },
            TyKind::Tuple(tuple_fields) => {
                for field in tuple_fields.iter() {
                    self.visit_ty(field, copy_ty_context(&ty_context));
                } },
            _ => return,
        }
    }
}
}

The AdtDefs are finally stored in the field adt_recorder. After collecting all AdtDefs, we have 4 sub-phases to generate the summary for each type.

#![allow(unused)]
fn main() {
start_channel(|did| self.extract_isolaoted_parameter(did), &dids);
start_channel(|did| self.extract_isolaoted_parameter_prop(did), &dids);
start_channel(|did| self.extract_heap_item_unit(did), &dids);
start_channel(|did| self.extract_owner_prop(did), &dids);
}

Phase I: Extract Isolated Parameters

This phase extracts isolated params in ADT types; a param here means a stand-alone generic parameter such as T, A, etc.

The sub-visitor heap_item::IsolatedParameter visits the given type recursively and extracts all params to construct a mapping.

#![allow(unused)]
fn main() {
#[derive(Clone)]
struct IsolatedParameter<'tcx> {
    tcx: TyCtxt<'tcx>,
    parameters: Parameters,
}
}

Note that we are only interested in isolated params (e.g., T, but not *mut T). Such a param may occupy an entire field, appear recursively inside a tuple or an array, or be a mix of these cases.

The implementation can be found in the function extract_raw_generic and the trait impl TypeVisitor for IsolatedParameter.

Here, we give some examples of extracting isolated parameters. Given a struct Example<A, B, T, S>:

#![allow(unused)]
fn main() {
 struct Example<A, B, T, S> {
        a: A,
        b: (i32, (f64, B)),
        c: [[(S) ; 1] ; 2],
        d: Vec<T>,
}
}

The final result for <A, B, T, S> is <true, true, false, true>.

Phase II: Propagate Isolated Parameter

This phase extracts all params in the ADT types (like the param T) and then propagates them from the bottom to the top.

This procedure is the successor of extracting isolated parameters; the key point of IsolatedParameterPropagation is to propagate params from the bottom AdtDef to the top, as well as updating the TypeAnalysis context.

Note that it thoroughly considers the monomorphization of the AdtDef. That means the types Vec<T>, Vec<Vec<T>>, and Vec<i32> are all different!

Given a struct Example<A, B, T, S>:

#![allow(unused)]
fn main() {
struct X<A> {
        a: A,
}
struct Y<B> {
    a: (i32, (f64, B)),
    b: X<i32>,
}
struct Example<A, B, T, S> {
    a: X<A>,
    b: (i32, (f64, B)),
    c: [[(S) ; 1] ; 2],
    d: Vec<T>,
}
}

The result for X<A> is <true>, for Y<B> is <true>, and for Example<A, B, T, S> is <true, true, false, true>.

Phase III: Extract Heap-item Unit

Next, we need to extract all types that have a PhantomData<T> inside, where T must be an isolated parameter. We consider each such type as a unit to guide the traversal over ADT types.

For a heap-item unit, only a struct will contain the information we want, for example:

#![allow(unused)]
fn main() {
use std::marker::PhantomData;
use std::ptr::NonNull;

// Example:
struct Foo<T> {
    ptr: NonNull<T>,         // this field indicates a pointer
    _marker: PhantomData<T>, // this field indicates ownership
}
}

In fact, the Heap-item unit we defined here needs to meet the following three criteria:

  1. Within the current struct, there must be a PhantomData.
  2. The generic parameter of this PhantomData cannot be a reference.
  3. There is a pointer field, and the type of the pointer field matches that of PhantomData.

Thus, we mark each type found in this way as a heap-item unit.

Phase IV: Propagate Heap Item Result

All processes in this phase are the same as in Phase II, except that the search target changes from "isolated parameters" to "heap-item units"; this corresponds to the implementation of core::heap_item::OwnerPropagation.

#![allow(unused)]
fn main() {
#[derive(Clone)]
struct OwnerPropagation<'tcx, 'a> {
    tcx: TyCtxt<'tcx>,
    ownership: RawTypeOwner,
    unique: Unique,
    ref_adt_owner: &'a AdtOwner,
}
}

APIs Provided

To facilitate the use of the above analysis results, we provide the following well-packaged APIs; each API comes with comprehensive documentation and usage instructions.

  1. core::heap_item::IsolatedParameter is used to determine if the type has an isolated parameter.
  2. core::heap_item::HeapItem is used to determine if the type has a heap item.
  3. core::heap_item::Encoder encodes the type into a binary vector of 0-1.

All the above structures are ZSTs (zero-sized types), so constructing them does not incur any additional overhead.

Their methods require at least two parameters, namely the rCanary global context rcx and the type ty. We also provide a series of APIs to check whether the fields and variants of composite types meet specific requirements, or to directly obtain all analysis results.

Chapter 5.5. Interval Analysis in Rust (TO ADD)

Overview

Interval analysis is a type of static analysis used to track the range (interval) of possible values that a variable can take during the execution of a program. By maintaining an upper and lower bound for each variable, the analysis helps optimize code by narrowing the values that a program may handle. In Rust, interval analysis is particularly important in verifying safety properties, such as ensuring that an array access is always within bounds or that certain calculations do not result in overflows.

Goals of Interval Analysis

  • Range detection: Identify the intervals of values a variable may assume at different points in the program.
  • Error prevention: Ensure safe memory accesses (e.g., avoiding out-of-bound accesses in arrays).
  • Optimization: Provide opportunities for compiler optimizations by deducing more precise value ranges.

Flow Sensitivity

Interval analysis in Rust can be flow-sensitive, meaning that it accounts for the different execution paths a program might take. This allows the analysis to track how intervals change as variables are assigned or modified during the execution flow, improving the precision of analysis.

Lattice-based Approach

In this approach, values of variables are represented in a lattice, where each element represents an interval. A lattice ensures that each combination of intervals has a defined result, and merging different paths of a program is done by taking the least upper bound (LUB) of intervals from each path.

For example, if a variable x can have an interval [0, 10] on one path and [5, 15] on another path, the merged interval would be [0, 15] because that represents the union of both possible value ranges.
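
A minimal sketch of such an interval join (illustrative only):

#[derive(Clone, Copy, Debug, PartialEq)]
struct Interval {
    lo: i64,
    hi: i64,
}

impl Interval {
    // The least upper bound of two intervals in the interval lattice.
    fn join(self, other: Interval) -> Interval {
        Interval {
            lo: self.lo.min(other.lo),
            hi: self.hi.max(other.hi),
        }
    }
}

fn main() {
    let a = Interval { lo: 0, hi: 10 };
    let b = Interval { lo: 5, hi: 15 };
    assert_eq!(a.join(b), Interval { lo: 0, hi: 15 });
}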

Meet-over-all-paths (MOP) Approach

In the meet-over-all-paths (MOP) approach, the analysis is performed by considering every possible path through a program and merging the results into a final interval. This approach is path-sensitive but may be less scalable on large programs because it needs to account for all paths explicitly.

Example: Lattice-based Interval Analysis

Consider the following Rust function that updates the value of a variable x based on certain conditions:

#![allow(unused)]
fn main() {
fn example(x: i32, y: i32) -> i32 {
    if x > 0 {
        x + 5
    } else {
        y + 2
    }
}

}

The function returns x + 5 if x is positive, and y + 2 otherwise. An interval analysis would consider two scenarios:

  • On the path where x > 0, the result takes values from the interval of x + 5.
  • On the other path, the result is determined by y + 2, whose interval is tracked separately.

Thus, the final result interval merges these possibilities.

MIR Representation

Rust’s intermediate representation (MIR) is key to performing interval analysis effectively. Analyzing MIR allows the compiler to track variable assignments and conditions at a low level, increasing the precision of the analysis.

#![allow(unused)]
fn main() {
fn example(_1: i32, _2: i32) -> i32 {
    bb0: {
        switchInt(_1) -> [0: bb1, otherwise: bb2];
    }

    bb1: {
        _0 = _2 + const 2;
        return;
    }

    bb2: {
        _0 = _1 + const 5;
        return;
    }
}

}

In this MIR code, the interval analysis would track how _0 is assigned based on the conditions in different basic blocks, allowing for optimizations and safety checks.

Case Study: Array Bound Checking

Consider the case where interval analysis can be used to check array bounds. Rust guarantees memory safety through strict bounds checking, but interval analysis can help eliminate redundant checks or prevent unsafe accesses during compile-time.

#![allow(unused)]
fn main() {
fn get_element(arr: &[i32], index: usize) -> Option<i32> {
    if index < arr.len() {
        Some(arr[index])
    } else {
        None
    }
}

}

Using interval analysis, the compiler can infer the interval for index and ensure that it remains within bounds ([0, arr.len())). If an interval of index is known to always be within this range, the compiler could potentially eliminate the bounds check, optimizing the code.

Quick Usage Guide

Developers interested in interval analysis can explore tools or experimental Rust compiler flags for static analysis. To integrate interval analysis into a Rust project, the following steps are typical:

  1. Define intervals for each variable.
  2. Propagate intervals throughout the control flow of the function.
  3. Merge intervals at control flow joins using a lattice or MOP-based approach.
  4. Check safety properties, such as bounds checking, based on the computed intervals.

Key Steps of Interval Analysis Algorithm

  1. Graph Construction: Build the control-flow graph (CFG) for the function.
  2. Interval Propagation: Traverse the CFG and propagate intervals for variables through each basic block.
  3. Merge Intervals: At each control-flow merge point, take the least upper bound of intervals.
  4. Validation: Verify if the intervals meet safety properties like array bounds and valid arithmetic operations.

Chapter 5.6. Value-flow Analysis

Value-flow analysis tracks how values flow through a program, including copies, moves, references, and other scenarios. With this module, users can easily obtain a whole picture of the value flow within any function and query whether there is a value dependency between two MIR local variables.

This module defines a graph data structure to store the data parsed from the Rust MIR. The graph nodes are indexed by Local, which is defined by rustc, and the edges between nodes encode the data dependency relationships.
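
For illustration (the concrete definitions in RAPx may differ), such a graph can be sketched as an adjacency structure indexed by Local:

use rustc_middle::mir::Local;

// Illustrative sketch: each MIR local has a node; an edge records how a
// value flows from one local into another.
enum EdgeOp {
    Copy,
    Move,
    Ref,
    // ... other kinds of value flow
}

struct Edge {
    to: Local,
    op: EdgeOp,
}

struct ValueFlowGraph {
    // Outgoing edges per node, i.e., an adjacency list indexed by Local.
    edges: Vec<Vec<Edge>>,
}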

Quick Usage Guide

Developers can test the feature using the following command:

cargo rapx -dataflow

To switch to debug mode and draw the graph with Graphviz, execute the following command.

cargo rapx -dataflow=debug

For example, we can apply the value-flow analysis to the dangling_min case. The function create_vec to be analyzed is as follows:

#![allow(unused)]
fn main() {
fn create_vec() -> *mut Vec<i32> {
    let mut v = Vec::new();
 //Fix: let mut v = Box::new(Vec::new());
    v.push(1);
    &mut v as *mut Vec<i32>
 //Fix: Box::into_raw(v)
}
}

To utilize the analysis results in other analytical features, developers can use the dataflow module as follows:

#![allow(unused)]
fn main() {
use analysis::core::dataflow::Dataflow; // import the module
let dataflow = Dataflow::new(tcx); // create a dataflow object
dataflow.build_graphs(); // parse all the functions in tcx and build corresponding graphs
}

Graph APIs

DFS

This function performs a preorder (depth-first) traversal. The node operator and edge validator decide how far the traversal can reach through their return values. You can also modify outside variables captured by these two closures to record data during the DFS. traverse_all decides whether the traversal continues after a branch has successfully found the target.

For example, if you need to instantly stop the traversal once finding a certain node, then set traverse_all to false. If you want to traverse all the reachable nodes which are decided by the operator and validator, then set traverse_all to true.

#![allow(unused)]
fn main() {
pub fn dfs<F, G>(&self, now: Local, direction: Direction, node_operator: &mut F, edge_validator: &mut G, traverse_all: bool) -> DFSStatus
where 
 F: FnMut(&Graph, Local) -> DFSStatus,
 G: FnMut(&Graph, EdgeIdx) -> DFSStatus,
}
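
A hedged usage sketch of dfs, counting all nodes reachable downstream from a given local (the variant names Direction::Downside and DFSStatus::Continue are taken from the module, but the details may differ):

let mut visited = 0usize;
let mut node_operator = |_graph: &Graph, _node: Local| {
    visited += 1;       // record every node reached
    DFSStatus::Continue // keep exploring from this node
};
let mut edge_validator = |_graph: &Graph, _edge: EdgeIdx| {
    DFSStatus::Continue // accept every edge
};
graph.dfs(
    Local::from_usize(1),  // start from the first argument, for example
    Direction::Downside,   // follow the value flow forward
    &mut node_operator,
    &mut edge_validator,
    true,                  // traverse all reachable nodes
);
println!("{} nodes reachable", visited);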

is_connected

This function is built upon the DFS API. It tries to find idx_2 from idx_1, upside first then downside.

#![allow(unused)]
fn main() {
pub fn is_connected(&self, idx_1: Local, idx_2: Local) -> bool
}

Chapter 6. RAP Applications

This section introduces several applications of RAPx, covering both security and performance use cases. For security, RAPx currently supports the detection of dangling pointer and memory leak bugs, and it can be used to audit unsafe code. Please refer to the sub-chapters for more information.

Chapter 6.1. Dangling Pointer Detection

Rust uses ownership-based resource management (OBRM) and automatically deallocates unused resources without a garbage collector. This approach can potentially lead to premature memory deallocation, resulting in use-after-free or double-free errors. A significant portion of these bugs is related to the unwinding path, making them difficult to detect through testing or dynamic analysis. For more details, please refer to the SafeDrop paper published in TOSEM.

PoC

Below is a toy example demonstrating a Rust program with a use-after-free bug. The Data object is automatically dropped once the program exits the inner block scope of main. Accessing ptr after this point triggers a use-after-free error. You can find several such PoC examples within the folder of RAPx/tests.

struct Data {
    value: Box<i32>,
}

impl Data {
    fn new(value: i32) -> Data {
        Data { value:Box::new(value) }
    }

    fn print_value(&self) {
        println!("Value: {}", self.value);
    }
}

fn main() {
    let ptr: *const Data;
    
    {
        let data = Data::new(42);
        ptr = &data as *const Data;
    } // data is automatically dropped here, leaving ptr dangling.

    unsafe {
        (*ptr).print_value(); // use-after-free.
    }
}

Usage

To detect such bugs, navigate to the project directory and execute the following command.

cargo rapx -uaf

RAPx outputs a warning message in yellow if bugs are detected. For example, when applying RAPx to the previous PoC, the log message appears as follows:

21:49|RAP-FRONT|WARN|: Use after free detected in function "main"
21:49|RAP-FRONT|WARN|: Location: src/main.rs:27:9: 27:20 (#0)
21:49|RAP-FRONT|WARN|: Location: src/main.rs:27:9: 27:34 (#0)

Mechanism

There are two essential steps in detecting such bugs:

  • Alias preparation We begin by running an alias model to identify the aliases among the arguments and return value of each function. This result is later used to streamline inter-procedural analysis. The current implementation is based on an mop-based alias analysis module.
  • Bug detection We traverse each path of a target function, performing alias analysis at each program point along the way. During this process, we check for any incorrect use or return of dangling pointers.

The implementation can be found as a module named SafeDrop.

Reference

The feature is based on our SafeDrop paper, which was published in TOSEM.

@article{cui2023safedrop,
  title={SafeDrop: Detecting memory deallocation bugs of rust programs via static data-flow analysis},
  author={Cui, Mohan and Chen, Chengjun and Xu, Hui and Zhou, Yangfan},
  journal={ACM Transactions on Software Engineering and Methodology},
  volume={32},
  number={4},
  pages={1--21},
  year={2023},
  publisher={ACM New York, NY, USA}
}

Chapter 6.2. Memory Leakage Detection

Rust employs a novel ownership-based resource management model to facilitate automated deallocation at compile time. However, developers may intervene and switch objects into a manual drop mode, which is prone to memory leaks.

rCanary is a static model checker to detect leaks across the semi-automated boundary.

We support the detection of the following two types of issues:

  1. Orphan Object: An orphan object is the heap item wrapped by the smart pointer ManuallyDrop.
fn main() {
    let mut buf = Box::new("buffer");
    // heap item ’buf’ becomes an orphan object
    let ptr = Box::into_raw(buf);
    // leak by missing free operation on ’ptr’
    // unsafe { drop_in_place(ptr); }
}
  2. Proxy type: A proxy type is a compound type having at least one field that stores an orphan object.
struct Proxy<T> {
    ptr: *mut T,
}
impl<T> Drop for Proxy<T> {
    fn drop(&mut self) {
        // user should manually free the field 'ptr'
        // unsafe { drop_in_place(self.ptr); }
    }
}

fn main() {
    let mut buf = Box::new("buffer");
    // heap item 'buf' becomes an orphan object
    let ptr = &mut *ManuallyDrop::new(buf) as *mut _;
    let proxy = Proxy { ptr };
    // leak by missing free of 'proxy.ptr' in drop
}

What's behind rCanary?

rCanary is a component of RAPx, and its overall architecture is shown in the following diagram.

Framework of type analysis.

It generates SMT-LIB2 format constraints for Rust MIR and is implemented as a Cargo component. We design an encoder to abstract heap-allocated data and formalize a refined leak-free memory model based on Boolean satisfiability.

#![allow(unused)]
fn main() {
pub fn start_analyzer(tcx: TyCtxt, config: RapConfig) {
    let rcx_boxed = Box::new(RapGlobalCtxt::new(tcx, config));
    let rcx = Box::leak(rcx_boxed);

    let _rcanary: Option<rCanary> = if callback.is_rcanary_enabled() {
        let mut rcx = rCanary::new(tcx);
        rcx.start();
        Some(rcx)
    } else {
        None
    };
}

pub fn start(&mut self) {
    let rcx_boxed = Box::new(rCanary::new(self.tcx));
    let rcx = Box::leak(rcx_boxed);
    TypeAnalysis::new(rcx).start();
    FlowAnalysis::new(rcx).start();
}
}

The start_analyzer function defines all the analysis processes of rCanary, with two important steps being TypeAnalysis and FlowAnalysis, corresponding to the ADT-DEF Analysis, constraint construction, and constraint solving described in the paper.

Running rCanary

Invocation

Before using rCanary, please make sure that the Z3 solver is installed in your operating system.

If not, run brew install z3 (macOS) or apt-get install z3 (Linux); the minimum required Z3 version is 4.10.

Running rCanary within RAPx is very simple: just enter the root directory of the target Cargo project and run:

cargo rapx -mleak

Note: the analysis will be terminated if the directory is not a valid Cargo project root or the toolchain version does not match (nightly-2024-06-30 for this module).

Additional Configure Arguments

rCanary also provides several optional environment variables to output intermediate results.

  • ADTRES
    • Print the results of type analysis, including the type definition and the analysis tuple.
  • Z3GOAL
    • Emit the Z3 formula of the given function in SMT-LIB2 format.
  • ICXSLICE
    • Print the verbose intermediate metadata for rCanary debugging.

Note: These parameters may change due to version migration.

Running Results on PoC

For the test case, we chose the same PoC as in the paper to facilitate user understanding and testing.

fn main() {
    let mut buf = Box::new("buffer");
    // heap item ’buf’ becomes an orphan object
    let ptr = Box::into_raw(buf);
    // leak by missing free operation on ’ptr’
    // unsafe { drop_in_place(ptr); }
}

Running cargo rapx -mleak:

22:10:39|RAP|WARN|: Memory Leak detected in function main
warning: Memory Leak detected.
--> src/main.rs:3:16
  |
1 | fn main() { 
2 |     let buf = Box::new("buffer");
3 |     let _ptr = Box::into_raw(buf);
  |                ------------------ Memory Leak Candidates.
4 | }

ADTRES

#![allow(unused)]
fn main() {
core::fmt::rt::Placeholder [(Unowned, [])]
std::ptr::Unique<T/#0> [(Owned, [false])]
std::convert::Infallible []
std::marker::PhantomData<T/#0> [(Unowned, [false])]
std::boxed::Box<T/#0, A/#1> [(Owned, [false, true])]
std::alloc::Global [(Unowned, [])]
std::alloc::Layout [(Unowned, [])]
std::mem::ManuallyDrop<T/#0> [(Unowned, [true])]
std::result::Result<T/#0, E/#1> [(Unowned, [true, false]), (Unowned, [false, true])]
std::alloc::AllocError [(Unowned, [])]
core::fmt::rt::ArgumentType<'a/#0> [(Unowned, [false]), (Unowned, [false])]
core::fmt::rt::Count [(Unowned, []), (Unowned, []), (Unowned, [])]
std::fmt::Arguments<'a/#0> [(Unowned, [false])]
std::ptr::NonNull<T/#0> [(Unowned, [false])]
std::ptr::Alignment [(Unowned, [])]
core::fmt::rt::Alignment [(Unowned, []), (Unowned, []), (Unowned, []), (Unowned, [])]
std::ops::ControlFlow<B/#0, C/#1> [(Unowned, [false, true]), (Unowned, [true, false])]
core::fmt::rt::Argument<'a/#0> [(Unowned, [false])]
std::option::Option<T/#0> [(Unowned, [false]), (Unowned, [true])]
std::ptr::alignment::AlignmentEnum [(Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, []), (Unowned, [])]
}

Z3GOAL

#![allow(unused)]
fn main() {
(goal
  |CONSTRAINTS: T 0|
  (= |0_0_1_ctor_fn| #b01)
  |CONSTRAINTS: S 1 2|
  (= |1_2_1| #b00)
  (= |1_2_3_ctor_asgn| |0_0_1_ctor_fn|)
  |CONSTRAINTS: T 1|
  (= |1_0_3_drop_all| (bvand |1_2_3_ctor_asgn| #b10))
  (= |1_0_2_ctor_fn| #b1)
  |CONSTRAINTS: S 2 1|
  |CONSTRAINTS: T 2|
  (= |2_0_1_return| |1_2_1|)
  (= |2_0_1_return| #b00)
  (= |2_0_2_return| |1_0_2_ctor_fn|)
  (= |2_0_2_return| #b0)
  (= |2_0_3_return| |1_0_3_drop_all|)
  (= |2_0_3_return| #b00))
}

ICXSLICE

#![allow(unused)]
fn main() {
IcxSlice in Terminator: 0: _1 = std::boxed::Box::<&str>::new(const "buffer") -> [return: bb1, unwind continue]
IcxSliceForBlock
     [Taint { set: {} }, Taint { set: {} }, Taint { set: {} }, Taint { set: {} }]
     [0, 2, 0, 0]
     [Declared, Init(|0_0_1_ctor_fn|), Declared, Declared]
     [[], [Owned, Unowned], [], []]
     [TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true))), TyWithIndex(None), TyWithIndex(None)]

IcxSlice in Assign: 1 2: Assign((_3, move _1))
IcxSliceForBlock
     [Taint { set: {} }, Taint { set: {} }, Taint { set: {} }, Taint { set: {} }]
     [0, 2, 0, 2]
     [Declared, Init(|1_2_1|), Declared, Init(|1_2_3_ctor_asgn|)]
     [[], [Owned, Unowned], [], [Owned, Unowned]]
     [TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true))), TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))]

IcxSlice in Terminator: 1: _2 = std::boxed::Box::<&str>::into_raw(move _3) -> [return: bb2, unwind: bb3]
IcxSliceForBlock
     [Taint { set: {} }, Taint { set: {} }, Taint { set: {TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))} }, Taint { set: {} }]
     [0, 2, 1, 2]
     [Declared, Init(|1_2_1|), Init(|1_0_2_ctor_fn|), Init(|1_0_3_drop_all|)]
     [[], [Owned, Unowned], [Unowned], [Owned, Unowned]]
     [TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true))), TyWithIndex(Some((1, *mut &'{erased} str, None, true))), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))]

IcxSlice in Assign: 2 1: Assign((_0, const ()))
IcxSliceForBlock
     [Taint { set: {} }, Taint { set: {} }, Taint { set: {TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))} }, Taint { set: {} }]
     [0, 2, 1, 2]
     [Declared, Init(|1_2_1|), Init(|1_0_2_ctor_fn|), Init(|1_0_3_drop_all|)]
     [[], [Owned, Unowned], [Unowned], [Owned, Unowned]]
     [TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true))), TyWithIndex(Some((1, *mut &'{erased} str, None, true))), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))]

IcxSlice in Terminator: 2: return
IcxSliceForBlock
[Taint { set: {} }, Taint { set: {} }, Taint { set: {TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))} }, Taint { set: {} }]
[0, 2, 1, 2]
[Declared, Init(|1_2_1|), Init(|1_0_2_ctor_fn|), Init(|1_0_3_drop_all|)]
[[], [Owned, Unowned], [Unowned], [Owned, Unowned]]
[TyWithIndex(None), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true))), TyWithIndex(Some((1, *mut &'{erased} str, None, true))), TyWithIndex(Some((2, std::boxed::Box<&'{erased} str, std::alloc::Global>, None, true)))]
}

Reference

The feature is based on our rCanary work, which was published in TSE.

@article{cui2024rcanary,
  title={rCanary: Detecting memory leaks across semi-automated memory management boundary in Rust},
  author={Cui, Mohan and Tian, Hongliang and Xu, Hui and Zhou, Yangfan},
  journal={IEEE Transactions on Software Engineering},
  year={2024},
}

Chapter 6.3. Unsafety Code Audit

The unsafety propagation graph (UPG) is a novel method to model the essential usage and encapsulation of unsafe code. A UPG combines the traditional call graph with unsafety and dataflow information from Rust to capture the use and propagation of unsafe code across a project.

A UPG contains four major isolation types and nine structural patterns, which are used to split it into several small, self-contained subgraphs. These subgraphs, called unsafety isolation graphs (UIGs), can serve as useful audit units for examining the soundness of unsafe code encapsulation.

We will continue to explore and investigate more effective applications of UIG in the encapsulation of unsafe code in our subsequent research.

Before using this feature, make sure that Graphviz is installed on your device.

Usage

Generate UPG

Go to the directory containing Cargo.toml and run the command below from the terminal.

cargo rapx -upg

Then rapx will create a directory named UPG at the same level, which contains several connected unsafety propagation graphs.

Count UIGs

This feature counts the UIGs of each type in the target crate.

cargo rapx -uig

Check Safety Doc

With the following command, users can check for unsafe APIs that lack safety documentation.

cargo rapx -doc

RAPx will output the corresponding unsafe API information, including its location and visibility level, as follows:

Lack of unsafety doc: src/lib.rs:1668:1: 1674:11 (#0), visibility:Restricted(DefId(0:0 ~ smallvec[ecd8])).
Lack of unsafety doc: src/lib.rs:1699:1: 1704:11 (#0), visibility:Restricted(DefId(0:0 ~ smallvec[ecd8])).
Lack of unsafety doc: src/lib.rs:960:5: 960:45 (#0), visibility:Restricted(DefId(0:0 ~ smallvec[ecd8])).

Audit Unsafe APIs' SPs in core and alloc

Specifically, we currently integrate a set of SP (safety property) label analyses for the core and alloc crates of the Rust standard library.

  1. Create a new helloworld project.
  2. Navigate to the helloworld project directory and run:
cargo rapx -stdsp -- -Z build-std --target x86_64-unknown-linux-gnu > /your/output/log/path/result.log 2>&1

Replace /your/output/log/path with your desired output directory. This will output APIs where the caller and callee have different SPs.

Unsafe Code Annotation and SP Analysis

These features are under development.

Framework

The overall framework of this feature is consistent with RAPx's frontend framework. See the picture below for details.

Framework of unsafe code propagation analysis.

Chapter 6.4. Performance Bug Detection

This module is designed to identify performance bottlenecks and inefficiencies within a program by employing static analysis methods. After the analysis, RAPx will report potential code inefficiencies and their positions.

This module is still under development.

Performance Bugs Supported

Bounds checking

An example of unnecessary bounds checking is as follows.

#![allow(unused)]
fn main() {
fn foo(mut a: Vec<i32>) {
    for i in 0..a.len() {
        a[i] = a[i] + 1;
    }
}
}

Here RAPx will suggest replacing the safe index API with unsafe APIs.

Memory cloning

RAPx will report a warning where a cloned object is used as an immutable value. Developers need to manually check whether the cloning can be removed.

An example is as follows. Here the cloning is unnecessary because we can use borrowings as keys of the HashSet.

#![allow(unused)]
fn main() {
fn foo(a: &Vec<String>) {
    let mut b = HashSet::new();
    for i in a {
        let c = i.clone();
        b.insert(c);
    }
}
}

Data collections

Sometimes a different data collection offers better performance. RAPx will detect such patterns and give suggestions.

An example where HashSet is better than Vec is shown as follows.

#![allow(unused)]
fn main() {
fn foo(a: Vec<i32>, b: Vec<i32>) {
    for i in b.iter() {
        if a.contains(i) {}
    }
}
}

Usage

To detect such bugs, navigate to the project directory and execute the following command.

cargo rapx -opt

Chapter 6.5. Concurrency Bug Detection

TOBE ADDED (expected date: 2025-06-30)

Rust claims it has the feature of "Fearless Concurrency". However, it is still hard to prevent concurrency bugs in practice. This module is designed to detect potential concurrency bugs.

Specifically, RAPx now focuses on these types of concurrency bugs:

  • Lifetime-related bugs on locks, i.e., unexpected double locks or unexpected unlocks
  • (To be added)

This module is still under development.

Usage

To detect concurrency bugs, run the command below in the same directory as Cargo.toml.

cargo rapx -conc

Chapter 7. Utilities

This chapter introduces some utilities and gadgets of RAPx, which help print user-friendly debug messages or implement other functionality.

Chapter 7.1. A Quick Start of Log APIs

Library Usage

We use annotate_snippets to render code snippets with underlines and messages. The documentation of annotate_snippets is easy to follow.

#![allow(unused)]
fn main() {
let message = Level::Error.title("expected type, found `22`").snippet( //the title of the message
    Snippet::source(source) //the source code
        .line_start(26) //the line number from which the target code starts
        .origin("examples/footer.rs") //the filename of the source code
        .fold(true) //whether to fold the source code or not
        .annotation(
            Level::Error //message level
                .span(193..195) //the underbound and the upperbound are offsets from the starting line
                .label("expected struct `annotate_snippets::snippet::Slice`, found reference"), //the annotation to explain the underlined code
        )
        .annotation(Level::Info.span(34..50).label("while parsing this struct")), //add annotations like a chain
);
}

line_start is the starting line number of the code snippet you are interested in. The content before this line is dropped and is never counted in the offset again. The .span() method accepts a Range object whose lower and upper bounds are offsets from the first character of the starting line. The corresponding code segment will be underlined and annotated.

Some utilities are available for you to better use these APIs.

Utilities to Deal with the Span Objects

  • span_to_source_code(span: Span) -> String: accepts a Span object and returns its corresponding source code.

  • span_to_first_line(span: Span) -> Span: accepts a Span object and returns the span corresponding to the first line of its source code, or extends the span to an entire line.

  • span_to_trimmed_span(span: Span) -> Span: accepts a Span object and returns the span corresponding to head-trimmed source code.

  • span_to_filename(span: Span) -> String: accepts a Span object and returns the filename of the source code.

  • span_to_line_number(span: Span) -> usize: accepts a Span object and returns its line number in the source code.

  • relative_pos_range(span: Span, sub_span: Span) -> Range<usize>: accepts two Span objects and returns the relative position range between the two spans.

An Example

fn report_used_as_immutable(graph: &Graph, clone_span: Span, use_span: Span)

We need to annotate two code spans i.e. clone_span and use_span. The source code and filename can be acquired from the APIs above.

let code_source = span_to_source_code(graph.span);
let filename = span_to_filename(clone_span); //other spans also work here

Because there is no relationship between clone_span and use_span, we use graph.span, the span of the whole function, as the code background. Correspondingly, .line_start takes graph.span as its parameter. We use relative_pos_range to compute the relative position range; it takes graph.span, the background, and clone_span, the span to underline, as parameters.

let snippet = Snippet::source(&code_source)
    .line_start(span_to_line_number(graph.span))
    .origin(&filename)
    .fold(true)
    .annotation(
        Level::Error
            .span(relative_pos_range(graph.span, clone_span))
            .label("Cloning happens here."),
    )

Then we assemble the message with title and footer.

let message = Level::Warning
    .title("Unnecessary memory cloning detected")
    .snippet(snippet)
    .footer(Level::Help.title("Use borrowings instead."));

The final code looks like:

fn report_used_as_immutable(graph: &Graph, clone_span: Span, use_span: Span) {
    let code_source = span_to_source_code(graph.span);
    let filename = span_to_filename(clone_span);
    let snippet = Snippet::source(&code_source)
        .line_start(span_to_line_number(graph.span))
        .origin(&filename)
        .fold(true)
        .annotation(
            Level::Error
                .span(relative_pos_range(graph.span, clone_span))
                .label("Cloning happens here."),
        )
        .annotation(
            Level::Error
                .span(relative_pos_range(graph.span, use_span))
                .label("Used here"),
        );
    let message = Level::Warning
        .title("Unnecessary memory cloning detected")
        .snippet(snippet)
        .footer(Level::Help.title("Use borrowings instead."));
    let renderer = Renderer::styled();
    println!("{}", renderer.render(message));
}

And the message rendered looks like:

Example of log usage