Heap-item Analysis (Type Analysis)
What is heap item?
Here we first give several definitions for further usage in this section. These definitions can been referred in the paper of rCanary
.
- Heap Item: A heap item is an object containing the heap allocation with a fixed type (layout).
- Heap-item Unit: A heap-item unit has a
PhantomData<T>
field with a typed generic parameter and a pointer field to store the valueT
. - Isolated Parameter: An isolated parameter is a stand-alone and typed generic parameter (e.g.,
T
but not&T
).
In the type analysis, we need to generate an abstraction for all rust types used in the local crate. Each type has a layout summary to illustrate the data abstraction in the fully initialized state.
The summary comprises two parts:
- a boolean value indicating if it contains a heap item;
- a boolean vector indicating if each generic parameter is an isolated parameter.
For example, the summary of the vector Vec<T,A>
is (1) [0, 1]
.
Overview
We give the overall framework of type analysis as below.
Then we will discuss the workflow of each component in this figure.
The visitor
method is main pass of the constructor part in type analysis, it will perform several important procedural
to determine whether an adt definition (adt-def) will occupy at least one heap allocation, reflecting holding heap-ownership
in RAP system.
From the top-down method of our approach, this visitor
is the set of several sub-phases which means it contains multiple sub-visitors to make whole method self.visitor()
work.
For example, given an AdtDef
(like Vec<T>
), the result of visitor
contains two parts:
- pt1 Enum:
{Owned / UnOwned}
indicates whether it will directly have a heap data - pt2 Array:
[bool;N]
indicates whether each generic parameter will have a raw param
Those 2 parts can accelerate heap-ownership inference in the data-flow analysis.
Type Collector
We use an inter-procedural visitor to collect all the types for local crate and the types are after monomorphism. Our goal is to extract all the AdtDef
s.
The struct TypeAnalysis
implements mir::Visitor
to simulate as the type collector.
#![allow(unused)] fn main() { pub struct TypeAnalysis<'tcx, 'a> { rcx: &'a mut RapGlobalCtxt<'tcx>, fn_set: Unique, ty_map: TyMap<'tcx>, adt_recorder: Unique, } }
Note: the type in this phase is
Ty::ty
rather ofHir::ty
.
As we can see, the function visit_ty
will flatten all the fields and generics appeared. This procedural can cover almost all the rust types.
#![allow(unused)] fn main() { impl<'tcx, 'a> TypeAnalysis<'tcx, 'a> { fn visit_ty(&mut self, ty: Ty<'tcx>, ty_context: TyContext) { match ty.kind() { TyKind::Adt(adtdef, substs) => { // ... for field in adtdef.all_fields() { self.visit_ty(field.ty(self.tcx(), substs) ,copy_ty_context(&ty_context)) } for ty in substs.types() { self.visit_ty(ty, copy_ty_context(&ty_context)); } }, TyKind::Array(ty, ..) => { self.visit_ty(*ty, ty_context); }, TyKind::Slice(ty) => { self.visit_ty(*ty, ty_context); }, TyKind::RawPtr(typeandmut) => { let ty = typeandmut.ty; self.visit_ty(ty, ty_context); }, TyKind::Ref(_, ty, ..) => { self.visit_ty(*ty, ty_context); }, TyKind::Tuple(tuple_fields) => { for field in tuple_fields.iter() { self.visit_ty(field, copy_ty_context(&ty_context)); } }, _ => return, } } } }
The AdtDef
are finally stored in the field adt_recorder
. After the collecting all AdtDef
, we have 4 phases to generate the summary for each type.
#![allow(unused)] fn main() { start_channel(|did| self.extract_isolaoted_parameter(did), &dids); start_channel(|did| self.extract_isolaoted_parameter_prop(did), &dids); start_channel(|did| self.extract_heap_item_unit(did), &dids); start_channel(|did| self.extract_owner_prop(did), &dids); }
Extract Isolated Parameters
Extract params in adt types, the param
means one generic parameter acting like T
, A
, etc...
In the sub-visitor IsolatedParameter
, it will visit the given type recursively, and extract all params to construct a mapping.
#![allow(unused)] fn main() { #[derive(Clone)] struct IsolatedParameter<'tcx> { tcx: TyCtxt<'tcx>, parameters: Parameters, } }
Note we only interest in isolated params
(T
not like *mut T
). It lies in one-entire field | recursive in tuple | recursive in array | mixed before.
The implementation can be found in function extract_raw_generic
and trait impl impl TypeVisitor for IsolatedParameter
.
Here, we give some examples of extracting isolated parameters. Given a struct Example<A, B, T, S>
:
#![allow(unused)] fn main() { struct Example<A, B, T, S> { a: A, b: (i32, (f64, B)), c: [[(S) ; 1] ; 2], d: Vec<T>, } }
The final result for <A, B, T, S>
is <true, true, false, true>
.
Propagate Isolated Parameter
Extract all params in the adt types like param T
and then propagate from the bottom to top.
This procedural is the successor of extracting isolated parameters, and the main idea of IsolatedParameterPropagation is to propagate params from bottom AdtDef
to the top as well as updating TypeAnalysis Context
.
Note that it will thorough consider mono-morphization existed in the AdtDef
. That means the type Vec<T>
, Vec<Vec<T>>
and Vec<i32>
are totally different!!!!
Given a struct Example<A, B, T, S>
:
#![allow(unused)] fn main() { struct X<A> { a: A, } struct Y<B> { a: (i32, (f64, B)), b: X<i32>, } struct Example<A, B, T, S> { a: X<A>, b: (i32, (f64, B)), c: [[(S) ; 1] ; 2], d: Vec<T>, } }
The final result for X<A>
is <true>
, for Y<B>
is <true>
, and for Example<A, B, T, S>
is <true, true, false, true>
.
Extract Heap-item Unit
Next, we need to extract all types that has PhantomData<T>
inside which T
must be a isolated parameter. Consider these types as a unit to guide the traversal over adt types.
As for one heap-item unit, only struct will contains the information that we want like
#![allow(unused)] fn main() { // Example: struct Foo<T> { NonNull<T>, // this indicates a pointer PhantomData<T>, // this indicates a ownership } }