Relink don't Rebuild
Metadata | |
---|---|
Point of contact | Jane Lusby |
Status | Proposed |
Tracking issue | |
Zulip channel | |
Teams | cargo, compiler |
Task owners | Ally Sommers, Piotr Osiewicz |
cargo champion | Weihang Lo |
compiler champion | Oliver Scherer |
Summary
Work towards avoiding rebuilds of a crate's dependents for changes that don't affect the crate's public interface.
Motivation
Changing a comment, reordering use statements, adding a dbg! statement to a non-inlinable function, formatting code, or moving item definitions from one impl block to another identical one all cause rebuilds of the edited crate's reverse dependencies.
This clashes with users' intuition for what needs to be rebuilt when certain changes are made and makes iterating more painful.
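As a small illustration of such a change, the sketch below shows a hypothetical library function before and after an edit that adds a comment and a dbg! call. The module names, the `checksum` function, and the use of `#[inline(never)]` to keep the body out of dependents are all assumptions made for this example; the point is only that the public signature is identical in both versions.

```rust
// Hypothetical crate contents before and after an edit.
// Both versions expose the same public signature:
//     pub fn checksum(data: &[u8]) -> u32

pub mod before {
    /// Sums the bytes of `data`, wrapping on overflow.
    #[inline(never)] // assumption: the body is never inlined into dependents
    pub fn checksum(data: &[u8]) -> u32 {
        data.iter().fold(0u32, |acc, b| acc.wrapping_add(*b as u32))
    }
}

pub mod after {
    /// Sums the bytes of `data`, wrapping on overflow.
    // A newly added comment, plus a debugging aid in the body.
    #[inline(never)]
    pub fn checksum(data: &[u8]) -> u32 {
        let sum = data.iter().fold(0u32, |acc, b| acc.wrapping_add(*b as u32));
        dbg!(sum) // prints to stderr, then evaluates to `sum` unchanged
    }
}
```

A dependent that calls `checksum` sees the same name, types, and result in both versions, which is why a rebuild of that dependent feels unnecessary.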
As a point of reference, in C and C++ – where there is a strict separation between interface and implementation in the form of header files – equivalent changes would only cause a rebuild of the translation unit whose source has been modified. For other units, existing compiler outputs would be reused (and re-linked into the final binary).
Our goal is to work towards making cargo and rustc smarter about when they do or don't need to rebuild dependent crates (reverse dependencies).
The status quo
As an example, consider the rg binary in the ripgrep package.
Its crate dependency graph (narrowed to only include dependents of globset, a particular crate in ripgrep's Cargo workspace) looks like this:
❯ cargo tree --invert globset
globset v0.4.16
├── grep-cli v0.1.11
│   └── grep v0.3.2
│       └── ripgrep v14.1.1
└── ignore v0.4.23
    └── ripgrep v14.1.1
flowchart TB
  globset
  grep-cli
  grep
  ignore
  ripgrep
  globset --> ignore --> ripgrep
  globset --> grep-cli --> grep --> ripgrep
Consider a change that does not alter the interface of the globset crate (for example, modifying a private item or changing a comment within globset's source code).
Here is the output of cargo build --timings for an incremental build of ripgrep where only such a change was made to globset:
Ideally, in this scenario, the transitive dependents of globset (that only depend on globset's "interface") would not need to be rebuilt. This would allow us to skip the grep-cli, ignore, grep, and ripgrep re-compiles and only redo linking of the final binary ("relink, don't rebuild")1.
For smaller/shallow dep graphs (like the above) the extra rebuilds are tolerable, but for deeper graphs, these rebuilds significantly impact edit-debug cycle times.
Transitive Deps and the Build System View
Ideally the crate-level dependency graph above would (morally) correspond to a build graph like this2:
flowchart TB
  subgraph globset[globset compile]
    globset.rmeta:::rmeta
    globset.rlib:::rlib
  end
  subgraph grep-cli[grep-cli compile]
    grep-cli.rmeta:::rmeta
    grep-cli.rlib:::rlib
  end
  subgraph grep[grep compile]
    grep.rmeta:::rmeta
    grep.rlib:::rlib
  end
  subgraph ignore[ignore compile]
    ignore.rmeta:::rmeta
    ignore.rlib:::rlib
  end
  subgraph ripgrep[ripgrep compile]
    %% ripgrep.rmeta:::rmeta
    ripgrep.rlib:::rlib
  end
  ripgrep_bin["`rg (bin)`"]
  classDef rmeta fill:#ea76cb
  classDef rlib fill:#2e96f5
  %% linker inputs (`rlib`s):
  globset.rlib & grep-cli.rlib & grep.rlib & ignore.rlib & ripgrep.rlib -.-> ripgrep_bin
  %% direct deps (`rmeta`s):
  globset.rmeta --> ignore
  ignore.rmeta --> ripgrep
  globset.rmeta --> grep-cli
  grep-cli.rmeta --> grep
  grep.rmeta --> ripgrep
In particular, note that crate compiles use the rmetas of their direct dependencies.
However, in reality crate compiles need access to all transitive rmetas:
flowchart TB
  subgraph globset[globset compile]
    globset.rmeta:::rmeta
    globset.rlib:::rlib
  end
  subgraph grep-cli[grep-cli compile]
    grep-cli.rmeta:::rmeta
    grep-cli.rlib:::rlib
  end
  subgraph grep[grep compile]
    grep.rmeta:::rmeta
    grep.rlib:::rlib
  end
  subgraph ignore[ignore compile]
    ignore.rmeta:::rmeta
    ignore.rlib:::rlib
  end
  subgraph ripgrep[ripgrep compile]
    %% ripgrep.rmeta:::rmeta
    ripgrep.rlib:::rlib
  end
  ripgrep_bin["`rg (bin)`"]
  classDef rmeta fill:#ea76cb
  classDef rlib fill:#2e96f5
  %% linker inputs (`rlib`s):
  globset.rlib & grep-cli.rlib & grep.rlib & ignore.rlib & ripgrep.rlib -.-> ripgrep_bin
  %% direct deps (`rmeta`s):
  globset.rmeta --> ignore
  ignore.rmeta --> ripgrep
  globset.rmeta --> grep-cli
  grep-cli.rmeta --> grep
  grep.rmeta --> ripgrep
  %% transitive deps (`rmeta`s):
  globset.rmeta ==> ripgrep & grep
  grep-cli.rmeta ==> ripgrep
This means that when a crate's rmeta changes, the rustc invocations corresponding to all transitive dependents of that crate are rerun (even if intermediate rmetas are the same).
More concretely: when globset.rmeta changes, grep is rebuilt – even if grep-cli.rmeta (after grep-cli is re-compiled) hasn't changed.
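To make the file-level view concrete, here is a minimal sketch that encodes the direct dependencies from the ripgrep example and computes which compiles are rerun when one crate's rmeta changes. The function names are hypothetical and the model is deliberately simple: a compile is treated as taking the rmetas of all of its transitive dependencies as inputs, as described above.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Direct dependencies from the ripgrep example (crate -> its direct deps).
fn direct_deps() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("globset", vec![]),
        ("grep-cli", vec!["globset"]),
        ("ignore", vec!["globset"]),
        ("grep", vec!["grep-cli"]),
        ("ripgrep", vec!["grep", "ignore"]),
    ])
}

/// All transitive dependencies of `krate`, i.e. the crates whose rmetas
/// are inputs to its compile under the file-level view.
fn transitive_deps(
    graph: &BTreeMap<&'static str, Vec<&'static str>>,
    krate: &str,
) -> BTreeSet<&'static str> {
    let mut seen = BTreeSet::new();
    let mut stack: Vec<&'static str> = graph.get(krate).cloned().unwrap_or_default();
    while let Some(dep) = stack.pop() {
        // Only walk a dependency's own deps the first time we see it.
        if seen.insert(dep) {
            stack.extend(graph.get(dep).into_iter().flatten().copied());
        }
    }
    seen
}

/// Crates whose compile is rerun when the rmeta of `changed` changes.
fn rerun_when_changed(
    graph: &BTreeMap<&'static str, Vec<&'static str>>,
    changed: &str,
) -> BTreeSet<&'static str> {
    graph
        .keys()
        .copied()
        .filter(|krate| transitive_deps(graph, krate).contains(changed))
        .collect()
}
```

In this model, `rerun_when_changed(&direct_deps(), "globset")` contains grep even though grep only names grep-cli as a direct dependency.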
The fact that crate compiles depend on the rmetas for all transitive dependencies is significant because it inhibits our ability to get "early cutoff" (ECO). In reality, crate compiles are only actually sensitive to the subset of their transitive deps exposed via their direct deps, but under this view (file-level, in the eyes of the build system) crates are sensitive to transitive dependencies in their entirety.
More concretely: the grep crate is only sensitive to the parts of globset accessible via grep_cli – if a change is made to globset that doesn't affect this subset, we'd expect to see grep_cli being rebuilt but the existing grep outputs being reused (no grep rebuild).
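The behavior we would like can be sketched with a toy build loop. This is an idealized model, not how cargo or rustc work today: each crate's source is reduced to two hypothetical strings (`interface`, standing in for what lands in the rmeta, and `body`, standing in for private details), and a dependent is recompiled only if a direct dependency was recompiled and its interface actually differs from before.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Toy stand-in for a crate's source (both fields are simplifications).
#[derive(Clone, PartialEq)]
struct CrateSrc {
    interface: &'static str, // models the content of the rmeta
    body: &'static str,      // models private details that only affect the rlib
}

/// (crate, direct dependencies), listed in topological order.
fn example_graph() -> Vec<(&'static str, Vec<&'static str>)> {
    vec![
        ("globset", vec![]),
        ("grep-cli", vec!["globset"]),
        ("grep", vec!["grep-cli"]),
    ]
}

/// Sources for the example, parameterized over the `globset` source.
fn example_sources(globset: CrateSrc) -> BTreeMap<&'static str, CrateSrc> {
    BTreeMap::from([
        ("globset", globset),
        ("grep-cli", CrateSrc { interface: "grep-cli interface v1", body: "body v1" }),
        ("grep", CrateSrc { interface: "grep interface v1", body: "body v1" }),
    ])
}

/// Returns the crates that get recompiled when moving from `old` to `new`.
fn recompiled(
    graph: &[(&'static str, Vec<&'static str>)],
    old: &BTreeMap<&'static str, CrateSrc>,
    new: &BTreeMap<&'static str, CrateSrc>,
) -> Vec<&'static str> {
    let mut rmeta_changed: BTreeSet<&'static str> = BTreeSet::new();
    let mut rebuilt = Vec::new();
    for (name, deps) in graph {
        let source_edited = old[name] != new[name];
        let dep_rmeta_changed = deps.iter().any(|dep| rmeta_changed.contains(dep));
        if source_edited || dep_rmeta_changed {
            rebuilt.push(*name);
            // Cutoff check: only propagate if the fresh "rmeta" differs.
            if old[name].interface != new[name].interface {
                rmeta_changed.insert(*name);
            }
        }
    }
    rebuilt
}
```

With these inputs, a body-only edit to globset recompiles just globset, while an interface edit recompiles globset and grep-cli and then stops, because grep-cli's own interface is unchanged and grep's existing outputs can be reused.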
"Early cutoff" (ECO) refers to a build system optimization where we are able to detect that a freshly-built artifact is identical to a prior one and to then reuse existing artifacts of dependent crates from then on (instead of continuing to rebuild them).
The next 6 months
- Identify and remove "oversensitivity" in .rmeta
  - i.e. changes to spans, comments, etc. will not affect the .rmeta
  - coupled with cargo's unstable checksum-freshness feature, this would avoid triggering rebuilds for dependent crates
- Make DefIds more stable when items are added or reordered
  - today this is a major source of differences in compiler output
  - there are other things like SymbolIndexes which we may also want to stabilize
- Work on designs for enabling "transitive" ECO
  - i.e. the decision to rebuild should factor in what parts of a transitive crate dep are accessible via direct deps
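To illustrate why identifier stability matters when items are added or reordered, the sketch below contrasts two hypothetical id schemes over a list of item paths. It is not how rustc assigns DefIds; the function names and the use of the standard library's `DefaultHasher` are assumptions made purely for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Position-based ids: an item's id is its index in declaration order,
/// so inserting an item shifts the ids of everything after it.
fn positional_ids<'a>(items: &[&'a str]) -> BTreeMap<&'a str, u64> {
    items
        .iter()
        .enumerate()
        .map(|(index, path)| (*path, index as u64))
        .collect()
}

/// Path-based ids: an item's id is a hash of its path, so it does not
/// depend on which other items exist or where they are declared.
fn path_ids<'a>(items: &[&'a str]) -> BTreeMap<&'a str, u64> {
    items
        .iter()
        .map(|path| {
            let mut hasher = DefaultHasher::new();
            path.hash(&mut hasher);
            (*path, hasher.finish())
        })
        .collect()
}
```

For example, going from `["crate::parse", "crate::render"]` to `["crate::helper", "crate::parse", "crate::render"]` moves `crate::render` from position 1 to position 2 under the first scheme, while its path-based id stays the same.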
The "shiny future" we are working towards
Only changes to a crate that affect the public interface of the crate should cause downstream crates to rebuild.
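One way to picture this is as a fingerprint that only covers a crate's public interface. The sketch below is a hypothetical model, not rustc's metadata format: an `Item` carries a visibility flag, a signature string, a body string, and a line number standing in for a span. A fingerprint over everything changes on any edit, while a fingerprint over sorted public signatures changes only when the interface does.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified view of one item in a crate.
#[derive(Hash)]
struct Item {
    public: bool,
    signature: &'static str,
    body: &'static str,
    line: u32, // stand-in for a span
}

/// Oversensitive fingerprint: hashes every field of every item,
/// including private items, bodies, and positions.
fn naive_fingerprint(items: &[Item]) -> u64 {
    let mut hasher = DefaultHasher::new();
    items.hash(&mut hasher);
    hasher.finish()
}

/// Interface fingerprint: hashes only the signatures of public items,
/// sorted so that reordering items does not change the result.
fn interface_fingerprint(items: &[Item]) -> u64 {
    let mut signatures: Vec<&str> = items
        .iter()
        .filter(|item| item.public)
        .map(|item| item.signature)
        .collect();
    signatures.sort_unstable();
    let mut hasher = DefaultHasher::new();
    signatures.hash(&mut hasher);
    hasher.finish()
}
```

Under this model, a downstream crate would need to rebuild only when `interface_fingerprint` of a dependency changes.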
Ownership and team asks
Task | Owner(s) or team(s) | Notes |
---|---|---|
Design meeting | | |
Discussion and moral support | | |
Nightly experiment for RDR | | |
↳ Author MCP | Piotr Osiewicz | already accepted |
↳ Rustc Implementation | WIP | |
↳ Cargo Implementation | WIP | |
Improve DefId stability | Ally Sommers | |
Standard reviews | | |
Definitions
Definitions for terms used above:
- Discussion and moral support is the lowest level offering, basically committing the team to nothing but good vibes and general support for this endeavor.
- Design meeting means holding a synchronous meeting to review a proposal and provide feedback (no decision expected).
- Standard reviews refers to reviews for PRs against the repository; these PRs are not expected to be unduly large or complicated.
- Other kinds of decisions:
  - Compiler Major Change Proposal (MCP) is used to propose a 'larger than average' change and get feedback from the compiler team.
Frequently asked questions
Isn't rustc incremental enough?
Theoretically, yes: under a system like rust-analyzer where there isn't chunking of work along crate/file/process invocation boundaries, incremental compilation would obviate this effort.
However, under rustc's current architecture (one process invocation per crate, and a new process invocation for each compile rather than a daemon), RDR ("relink, don't rebuild", i.e. being able to skip rustc invocations) still matters.
Right now, even when 100% of a compile's incremental queries hit the cache (such as when you touch a source file; i.e. incr-unchanged), it still takes a non-negligible amount of time to replay those queries and re-emit compiler outputs (see Zulip thread).
1. cargo --timings output does not currently differentiate between time spent compiling (i.e. producing the rlib for) and linking the final binary (rg); the rg bar covers time spent for both.
2. We have taken some liberties in the above graph w.r.t. pipelining. Today, cargo performs a single rustc invocation to produce the rlib and rmeta for each crate – rmeta is modeled as an "early out". Additionally, producing ripgrep.rlib and linking (the rg (bin) node) happen as part of a single rustc invocation.