Appendix C: Glossary

The compiler uses a number of idiosyncratic abbreviations and terms. This glossary lists them and gives a few pointers for understanding them better.

| Term | Meaning |
| --- | --- |
| AST | the abstract syntax tree produced by the syntax crate; reflects user syntax very closely. |
| binder | a "binder" is a place where a variable or type is declared; for example, the `<T>` is a binder for the generic type parameter `T` in `fn foo<T>(..)`, and `\|a\| ...` is a binder for the parameter `a`. See the background chapter for more. |
| bound variable | a "bound variable" is one that is declared within an expression/term. For example, the variable `a` is bound within the closure expression `\|a\| a * 2`. See the background chapter for more. |
| codegen | the code to translate MIR into LLVM IR. |
| codegen unit | when we produce LLVM IR, we group the Rust code into a number of codegen units. Each of these units is processed by LLVM independently of the others, enabling parallelism. They are also the unit of incremental re-use. |
| completeness | completeness is a technical term in type theory. Completeness means that every type-safe program also type-checks. Having both soundness and completeness is very hard, and usually soundness is more important. (see "soundness"). |
| control-flow graph | a representation of the control-flow of a program; see the background chapter for more. |
| CTFE | Compile-Time Function Evaluation. This is the ability of the compiler to evaluate `const fn`s at compile time. This is part of the compiler's constant evaluation system. (see more) |
| cx | we tend to use "cx" as an abbreviation for context. See also `tcx`, `infcx`, etc. |
| DAG | a directed acyclic graph is used during compilation to keep track of dependencies between queries. (see more) |
| data-flow analysis | a static analysis that figures out what properties are true at each point in the control-flow of a program; see the background chapter for more. |
| DefId | an index identifying a definition (see `librustc/hir/def_id.rs`). Uniquely identifies a `DefPath`. |
| double pointer | a pointer with additional metadata. See "fat pointer" for more. |
| drop glue | (internal) compiler-generated instructions that handle calling the destructors (`Drop`) for data types. |
| DST | Dynamically-Sized Type. A type for which the compiler cannot statically know the size in memory (e.g. `str` or `[u8]`). Such types don't implement `Sized` and cannot be allocated on the stack. They can only occur as the last field in a struct. They can only be used behind a pointer (e.g. `&str` or `&[u8]`). |
| early-bound lifetime | a lifetime region that is substituted at its definition site. Bound in an item's `Generics` and substituted using a `Substs`. Contrast with late-bound lifetime. (see more) |
| empty type | see "uninhabited type". |
| fat pointer | a two-word value carrying the address of some value, along with some further information necessary to put the value to use. Rust includes two kinds of "fat pointers": references to slices, and trait objects. A reference to a slice carries the starting address of the slice and its length. A trait object carries a value's address and a pointer to the trait's implementation appropriate to that value. "Fat pointers" are also known as "wide pointers" and "double pointers". |
| free variable | a "free variable" is one that is not bound within an expression or term; see the background chapter for more. |
| generics | the set of generic type parameters defined on a type or item. |
| HIR | the High-level IR, created by lowering and desugaring the AST (see more) |
| HirId | identifies a particular node in the HIR by combining a def-id with an "intra-definition offset". |
| HIR Map | the HIR map, accessible via `tcx.hir`, allows you to quickly navigate the HIR and convert between various forms of identifiers. |
| ICE | Internal Compiler Error: a crash of the compiler itself. |
| ICH | incremental compilation hash. ICHs are used as fingerprints for things such as HIR and crate metadata, to check if changes have been made. This is useful in incremental compilation to see if part of a crate has changed and should be recompiled. |
| inference variable | when doing type or region inference, an "inference variable" is a kind of special type/region that represents what you are trying to infer. Think of X in algebra. For example, if we are trying to infer the type of a variable in a program, we create an inference variable to represent that unknown type. |
| infcx | the inference context (see `librustc/infer`) |
| intern | interning refers to storing certain frequently-used constant data, such as strings, and then referring to the data by an identifier (e.g. a `Symbol`) rather than the data itself, to reduce memory usage. |
| IR | Intermediate Representation. A general term in compilers. During compilation, the code is transformed from raw source (UTF-8 text) to various IRs. In Rust, these are primarily HIR, MIR, and LLVM IR. Each IR is well-suited for some set of computations. For example, MIR is well-suited for the borrow checker, and LLVM IR is well-suited for codegen because LLVM accepts it. |
| IRLO | IRLO or irlo is sometimes used as an abbreviation for internals.rust-lang.org. |
| item | a kind of "definition" in the language, such as a static, const, use statement, module, struct, etc. Concretely, this corresponds to the `Item` type. |
| lang item | items that represent concepts intrinsic to the language itself, such as special built-in traits like `Sync` and `Send`; or traits representing operations such as `Add`; or functions that are called by the compiler. (see more) |
| late-bound lifetime | a lifetime region that is substituted at its call site. Bound in a HRTB and substituted by specific functions in the compiler, such as `liberate_late_bound_regions`. Contrast with early-bound lifetime. (see more) |
| local crate | the crate currently being compiled. |
| LTO | Link-Time Optimizations. A set of optimizations offered by LLVM that occur just before the final binary is linked. These include optimizations such as removing functions that are never used in the final program. ThinLTO is a variant of LTO that aims to be a bit more scalable and efficient, but possibly sacrifices some optimizations. You may also read issues in the Rust repo about "FatLTO", which is the loving nickname given to non-Thin LTO. LLVM documentation: here and here |
| LLVM | (actually not an acronym :P) an open-source compiler backend. It accepts LLVM IR and outputs native binaries. Various languages (e.g. Rust) can then implement a compiler front-end that outputs LLVM IR and use LLVM to compile to all the platforms LLVM supports. |
| memoize | memoization is the process of storing the results of (pure) computations (such as pure function calls) to avoid having to repeat them in the future. This is typically a trade-off between execution speed and memory usage. |
| MIR | the Mid-level IR that is created after type-checking for use by borrowck and codegen (see more) |
| miri | an interpreter for MIR used for constant evaluation (see more) |
| normalize | a general term for converting to a more canonical form, but in the case of rustc typically refers to associated type normalization. |
| newtype | a "newtype" is a wrapper around some other type (e.g., `struct Foo(T)` is a "newtype" for `T`). This is commonly used in Rust to give a stronger type for indices. |
| NLL | non-lexical lifetimes, an extension to Rust's borrowing system that bases it on the control-flow graph. |
| node-id or NodeId | an index identifying a particular node in the AST or HIR; gradually being phased out and replaced with `HirId`. |
| obligation | something that must be proven by the trait system (see more) |
| point | used in the NLL analysis to refer to some particular location in the MIR; typically used to refer to a node in the control-flow graph. |
| projection | a general term for a "relative path", e.g. `x.f` is a "field projection", and `T::Item` is an "associated type projection". |
| promoted constants | constants extracted from a function and lifted to static scope; see this section for more details. |
| provider | the function that executes a query (see more) |
| quantified | in math or logic, existential and universal quantification are used to ask questions like "is there any type T for which this is true?" or "is this true for all types T?"; see the background chapter for more. |
| query | a sub-computation during compilation (see more) |
| region | another term for "lifetime" often used in the literature and in the borrow checker. |
| rib | a data structure in the name resolver that keeps track of a single scope for names. (see more) |
| sess | the compiler session, which stores global data used throughout compilation. |
| side tables | because the AST and HIR are immutable once created, we often carry extra information about them in the form of hashtables, indexed by the id of a particular node. |
| sigil | like a keyword but composed entirely of non-alphanumeric tokens. For example, `&` is a sigil for references. |
| placeholder | a way of handling subtyping around "for-all" types (e.g., `for<'a> fn(&'a u32)`) as well as solving higher-ranked trait bounds (e.g., `for<'a> T: Trait<'a>`). NOTE: the older term "skolemization" is deprecated in favor of "placeholder". See the chapter on placeholder and universes for more details. |
| soundness | soundness is a technical term in type theory. Roughly, if a type system is sound, then if a program type-checks, it is type-safe; i.e. one can never (in safe Rust) force a value into a variable of the wrong type. (see "completeness"). |
| span | a location in the user's source code, used primarily for error reporting. These are like a file-name/line-number/column tuple on steroids: they carry a start/end point, and also track macro expansions and compiler desugaring. All while being packed into a few bytes (really, it's an index into a table). See the `Span` datatype for more. |
| substs | the substitutions for a given generic type or item (e.g. the `i32`, `u32` in `HashMap<i32, u32>`) |
| tcx | the "typing context", main data structure of the compiler (see more) |
| 'tcx | the lifetime of the allocation arena (see more) |
| token | the smallest unit of parsing. Tokens are produced after lexing (see more). |
| TLS | Thread-Local Storage. Variables may be defined so that each thread has its own copy (rather than all threads sharing the variable). This has some interactions with LLVM. Not all platforms support TLS. |
| trait reference | the name of a trait along with a suitable set of input types/lifetimes, i.e. a trait and values for its type parameters (see more) |
| trans | the code to translate MIR into LLVM IR. Renamed to codegen. |
| ty | the internal representation of a type (see more). |
| UFCS | Universal Function Call Syntax. An unambiguous syntax for calling a method (see more). |
| uninhabited type | a type which has no values. This is not the same as a ZST, which has exactly 1 value. An example of an uninhabited type is `enum Foo {}`, which has no variants, and so can never be created. The compiler can treat code that deals with uninhabited types as dead code, since there is no such value to be manipulated. `!` (the never type) is an uninhabited type. Uninhabited types are also called "empty types". |
| upvar | a variable captured by a closure from outside the closure. |
| variance | variance determines how changes to a generic type/lifetime parameter affect subtyping; for example, if `T` is a subtype of `U`, then `Vec<T>` is a subtype of `Vec<U>` because `Vec` is covariant in its generic parameter. See the background chapter for a more general explanation. See the variance chapter for an explanation of how type checking handles variance. |
| wide pointer | a pointer with additional metadata. See "fat pointer" for more. |
| ZST | Zero-Sized Type. A type whose values have size 0 bytes. Since 2^0 = 1, such types can have exactly one value. For example, `()` (unit) is a ZST. `struct Foo;` is also a ZST. The compiler can do some nice optimizations around ZSTs. |
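
The "DST", "fat pointer" and "ZST" entries above can be checked directly with `std::mem::size_of`. The following is a small sketch, not taken from the compiler; the `Marker` struct and the `word` helper are hypothetical names used only for illustration, and the two-word sizes reflect how current rustc lays out these references.

```rust
use std::fmt::Debug;
use std::mem::size_of;

// Hypothetical unit struct: a zero-sized type with exactly one value.
struct Marker;

/// Size in bytes of one machine word, i.e. of a thin pointer.
fn word() -> usize {
    size_of::<usize>()
}

fn main() {
    // A reference to a sized type is a thin pointer: one word.
    assert_eq!(size_of::<&u8>(), word());

    // References to DSTs are fat pointers: two words.
    assert_eq!(size_of::<&[u8]>(), 2 * word()); // address + length
    assert_eq!(size_of::<&str>(), 2 * word()); // address + length in bytes
    assert_eq!(size_of::<&dyn Debug>(), 2 * word()); // address + vtable pointer

    // ZSTs occupy no memory at all.
    let _marker = Marker;
    assert_eq!(size_of::<()>(), 0);
    assert_eq!(size_of::<Marker>(), 0);

    println!("one word is {} bytes on this platform", word());
}
```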
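
To make the "intern" and "newtype" entries above more concrete, here is a minimal sketch of a string interner. It is not the compiler's implementation: `Sym` and `Interner` are hypothetical names, and this version stores each string twice for simplicity, which a real interner would avoid.

```rust
use std::collections::HashMap;

/// Hypothetical handle for an interned string. This is a "newtype" around
/// `u32`, so it cannot be confused with other integer indices.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Sym(u32);

/// Hypothetical interner: maps each distinct string to a small identifier.
#[derive(Default)]
struct Interner {
    ids: HashMap<String, Sym>,
    strings: Vec<String>,
}

impl Interner {
    /// Returns the existing identifier for `s`, or allocates a new one.
    fn intern(&mut self, s: &str) -> Sym {
        if let Some(&sym) = self.ids.get(s) {
            return sym;
        }
        let sym = Sym(self.strings.len() as u32);
        self.strings.push(s.to_owned());
        self.ids.insert(s.to_owned(), sym);
        sym
    }

    /// Looks up the string that an identifier stands for.
    fn resolve(&self, sym: Sym) -> &str {
        &self.strings[sym.0 as usize]
    }
}

fn main() {
    let mut interner = Interner::default();
    let a = interner.intern("foo");
    let b = interner.intern("bar");
    let c = interner.intern("foo");

    // Equal strings share one identifier, so comparing them is an integer comparison.
    assert_eq!(a, c);
    assert_ne!(a, b);
    assert_eq!(interner.resolve(b), "bar");
}
```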
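
The "memoize" entry above describes trading memory for speed. As a rough illustration, assuming a pure function whose results can be cached by argument, the sketch below memoizes a recursive Fibonacci computation. The `Memo` type and its `calls` counter are hypothetical and exist only to show that repeated requests do not recompute anything.

```rust
use std::collections::HashMap;

/// Hypothetical memoizing wrapper around a pure computation.
struct Memo {
    cache: HashMap<u64, u64>,
    /// Number of times a result actually had to be computed.
    calls: u32,
}

impl Memo {
    fn new() -> Self {
        Memo { cache: HashMap::new(), calls: 0 }
    }

    /// Fibonacci with memoization: each input above 1 is computed at most once.
    fn fib(&mut self, n: u64) -> u64 {
        if n < 2 {
            return n;
        }
        if let Some(&cached) = self.cache.get(&n) {
            return cached;
        }
        self.calls += 1;
        let value = self.fib(n - 1) + self.fib(n - 2);
        self.cache.insert(n, value);
        value
    }
}

fn main() {
    let mut memo = Memo::new();
    println!("fib(50) = {}", memo.fib(50));

    // Asking again is answered from the cache without further computation.
    let before = memo.calls;
    memo.fib(50);
    assert_eq!(memo.calls, before);
}
```

Without the cache, the same recursion would recompute the smaller inputs an exponential number of times; the cost here is one stored entry per distinct input.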
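
The "variance" entry above can be illustrated with lifetimes, which are where subtyping shows up in practice: `&'long str` is a subtype of `&'short str` whenever `'long` outlives `'short`. The function name `shorten` below is hypothetical; the sketch only shows that the compiler accepts the conversion because `Vec` is covariant in its parameter.

```rust
/// Accepted only because `Vec<T>` is covariant in `T`:
/// `Vec<&'long str>` is a subtype of `Vec<&'short str>`.
fn shorten<'short, 'long: 'short>(v: Vec<&'long str>) -> Vec<&'short str> {
    v
}

fn main() {
    let words: Vec<&'static str> = vec!["a", "b"];
    let local = String::from("c");

    // The vector of `'static` references is used where shorter-lived
    // references are expected, so a borrow of `local` can be pushed into it.
    let mut shorter: Vec<&str> = shorten(words);
    shorter.push(&local);

    assert_eq!(shorter, ["a", "b", "c"]);
}
```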