T-compiler Meeting Agenda 2022-06-02
Announcements
- Tomorrow <time:2022-06-03T10:00:00-04:00 Types Team: Planning/Deep-Dive calendar link
- Reminder: if you see a PR/issue that seems like there might be legal implications due to copyright/IP/etc, please let the Core team know (or at least message @pnkfelix or @Wesley Wiser so we can pass it along).
- Stable MIR MVP: https://github.com/rust-lang/rust/pull/97385
- basically just a single entry point for all external users of MIR
- expected to evolve to wrap common APIs with less-frequently changing ones.
- no path towards actual stabilization yet
Other WG meetings (calendar link)
- time:2022-06-06T16:00:00+02:00 wg-debugging status & design meeting
- time:2022-06-06T17:00:00+02:00 wg-rls-2.0 steering meeting
- time:2022-06-06T17:30:00+02:00 Async WG triage meeting
- time:2022-06-06T22:00:00+02:00 [wg-traits] GATs Sync
- time:2022-06-15T15:00:00+02:00 [Types team] Hack session: Advanced subtyping
MCPs/FCPs
- New MCPs (take a look, see if you like them!)
- No new proposals this time.
- Old MCPs (not seconded, take a look)
- “Tier 3 target proposal: riscv64gc-linux-android (Android target for
riscv64gc
)” compiler-team#472 (last review activity: 5 months ago)- :loudspeaker: Stale MCP: candidate for closing
- Zulip discussion
- MCP author replied to all questions, so it seems no visible concern to be addressed
- “
-Dwarnings
to cover all warnings” compiler-team#473 (last review activity: 5 months ago) - “Dealing with type/const ambiguities” compiler-team#480 (last review activity: 4 months ago)
- “Removing codegen logic for
nvptx-nvidia-cuda
(32-bit target)” compiler-team#496 (last review activity: 2 months ago) - “Arbitrary annotations in compiletest” compiler-team#513 (last review activity: about 24 days ago)
- “Tier 3 target proposal: riscv64gc-linux-android (Android target for
- Pending FCP requests (check your boxes!)
- “Stabilize
-Zgcc-ld=lld
as-Clink-self-contained=linker -Clinker-flavor=gcc-lld
” compiler-team#510 - “Increase the minimum linux-gnu versions” rust#95026
- Note: T-libs discussed and approved this change ([comment])(https://github.com/rust-lang/rust/pull/95026#issuecomment-1144222893)
- “Stabilize
- Things in FCP (make sure you’re good with it)
- “Build-time execution sandboxing” compiler-team#475
- “Change compiletest declarations parsing” compiler-team#512
- Accepted MCPs
- No new accepted proposals this time.
- Finalized FCPs (disposition merge)
- “Tracking Issue for
-Z terminal-width
” rust#84673 - “Remove migrate borrowck mode” rust#95565
- “Stabilize the
bundle
native library modifier” rust#95818 - “Modify MIR building to drop repeat expressions with length zero” rust#95953
- “Remove label/lifetime shadowing warnings” rust#96296
- “Tracking Issue for
WG checkins
- @_WG-traits (impl trait) by @nikomatsakis @oli (prev. checkin)
- fix soundness bug where a TAIT could imply a lifetime bound that doesn’t hold
- anonymous lifetimes on TAITs mentioned in function args can be connected to anonymous output lifetimes
- fix a bunch of ICEs that occurred because closures could now be malformed for certain params of a TAIT
- diagnostic improvement: hidden types are checked to be wf for the TAIT at definition site of each hidden type, not once for the opaque type
- soundness fix: no TAIT in impl block headers
- merge all defining uses of a TAIT within a function, instead of merging them at the end with other uses
- next steps in flight are
- @_WG-llvm by @nagisa and @Nikita Popov (prev. checkin)
- Last bits for opaque pointer support in rust have landed (https://github.com/rust-lang/rust/pull/94214) and I’ve enabled them in LLVM today, so we should be able to pick this up with the LLVM 15 upgrade without further changes.
- There’s a proposal for first-class aborting unwind support (https://discourse.llvm.org/t/rfc-add-call-unwindabort-to-llvm-ir/62543), that will likely be useful for rust as well.
- There’s a PR for virtual function elimination with LTO (https://github.com/rust-lang/rust/pull/96285). Though looks like it’s not entirely sound at the moment.
Backport nominations
T-compiler beta / T-compiler stable
- :beta: “don’t do
Sized
and other return type checks on RPIT’s real type” rust#97431- nominated by @oli
- fixes rust#97226, ICE when doing
Sized
check against the RPIT’s real type
- :beta: “Fix indices and remove some unwraps in arg mismatch algorithm” rust#97557
- patch authored and nominated for backport by @Michael Goulet (compiler-errors)
- Fixes rust#97484 (ICE when parsing args)
- not yet merged (cc @Jack Huey for review)
T-rustdoc beta / T-rustdoc stable
- No backport nominations for
T-rustdoc
this time.
:back: / :shrug: / :hand:
PRs S-waiting-on-team
- No PRs waiting on
T-compiler
this time.
Oldest PRs waiting for review
- “Add option to pass environment variables” rust#94387 (last review activity: about 33 days ago)
- cc latest reviewers: @Esteban Küber and @bjorn3
- “
Clone
suggestions” rust#95115 (last review activity: about 26 days ago)- assigned reviewer @Esteban Küber which suggested an additional reviewer (comment)
- also cc: @oli (as they left comments, too)
- “Add Finalize statement to make deaggregation “reversible” by storing all information in MIR” rust#96043 (last review activity: about 25 days ago)
- maybe a comment on perf run results? maybe cc: @Wesley Wiser
- “Remove opaque types from typeck expectations” rust#96727 (last review activity: about 25 days ago)
- assigned for review to @lcnr
- “Print type of every call in a method call chain” rust#96918 (last review activity: about 20 days ago)
- cc: @Michael Goulet (compiler-errors) maybe for comment on latest author comment?
Issues of Note
Short Summary
- 2 T-compiler P-critical issues
- 62 T-compiler P-high issues
- 0 P-critical, 0 P-high, 4 P-medium, 0 P-low regression-from-stable-to-beta
- 0 P-critical, 0 P-high, 2 P-medium, 0 P-low regression-from-stable-to-nightly
- 1 P-critical, 39 P-high, 77 P-medium, 9 P-low regression-from-stable-to-stable
P-critical
- “Infinite recursion optimized away with
opt-level=z
” rust#97428- @Nikita Popov already provided patch LLVM upstream and backport for our LLVM fork (comment)
- “Wrong cast of u16 to usize on aarch64” rust#97463
- very old unsoundness issue (comment) on tier 1 platform (tier support list), surfaced just now
- No
P-critical
issues forT-rustdoc
this time.
P-high regressions
- No
P-high
beta regressions this time.
Unassigned P-high nightly regressions
- No unassigned
P-high
nightly regressions this time.
Performance logs
A good week: The regressions were small; some have follow-up PR’s in flight to
address them; and we saw a big improvement from PR
#97345, which adds more fast
paths for quickly exiting comparisons between two types (such as BitsImpl<M>
and BitsImpl<N>
for const integers M
and N
). This improved compile-times
for the bitmaps
benchmark by 50-65% in some cases (including the trunk
nalgebra
, according to independent investigation from nnethercote). That same
PR had more modest improvements (1% to 2%) to the compile-times for a number of
other crates. Many thanks to lcnr and nnethercote for some excellent work here!
Triage done by @pnkfelix. Revision range: 43d9f385..0a43923a
Summary:
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 0.8% | 0.9% | 3 |
Improvements (primary) | -4.0% | -65.9% | 227 |
Improvements (secondary) | -2.0% | -7.7% | 217 |
All (primary) | -4.0% | -65.9% | 227 |
3 Regressions, 1 Improvements, 9 Mixed; 0 of them in rollups 59 artifact comparisons made in total 24 Untriaged Pull Requests
Regressions
Proc macro tweaks #97004 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.2% | 0.2% | 1 |
Regressions (secondary) | 0.6% | 1.5% | 10 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | N/A | N/A | 0 |
All (primary) | 0.2% | 0.2% | 1 |
- aparently making a
Buffer<T>
non-generic (its solely instantiated atu8
) caused a performance regression. - there is ongoing follow-up work to address the regression, either by making the buffer generic again, or by adding
#[inline]
annotations. - see e.g. PR 97539
rustdoc: include impl generics / self in search index #96652 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.5% | 1.0% | 3 |
Regressions (secondary) | 1.1% | 1.1% | 3 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | N/A | N/A | 0 |
All (primary) | 0.5% | 1.0% | 3 |
- This regressed doc-generation for a few primary benchmarks, but I think that might be a inherent cost of a change like this. I marked it as triaged based on that assumption.
Move things to rustc_type_ir
#97287 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.5% | 0.7% | 9 |
Regressions (secondary) | 1.1% | 1.1% | 3 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | N/A | N/A | 0 |
All (primary) | 0.5% | 0.7% | 9 |
- During development, this PR was identified as regressing one primary benchmark,
bitmaps
(in a variety of contexts), and the PR author said better to take that perf hit and deal with it later. - In the PR that landed, the number of regressing variants was smaller, but the affected set of benchmarks slightly larger: both
bitmaps
andunicode-normalization
are affected, regressing by 0.35% to 0.70%. - My very brief inpsection of the flamegraphs (old, new) didn’t show any smoking guns.
- At this point I think we can just live with this performance hit.
Improvements
Add suggestion for relaxing static lifetime bounds on dyn trait impls in NLL #97284 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | N/A | N/A | 0 |
Improvements (primary) | -0.2% | -0.2% | 1 |
Improvements (secondary) | -5.3% | -5.9% | 6 |
All (primary) | -0.2% | -0.2% | 1 |
- This managed to improve performance (slightly), probably because it restructured the
CallArgument
and that had fallout that ended up being positive overall (despite our intuition being that it would hurt performance due to the increase in size). - It might be nice to confirm that hypothesis independently (by isolating just that structural change and confirming that it has a similar effect on performance here) …
- … but, the improvements are essentially isolated to just the secondary wg-grammar benchmark, so its not really worth too much digging, except if we think it might reveal other structural changes we should make elsewhere.
Mixed
add a deep fast_reject routine #97345 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 0.9% | 1.6% | 10 |
Improvements (primary) | -11.0% | -65.8% | 45 |
Improvements (secondary) | -0.5% | -0.9% | 12 |
All (primary) | -11.0% | -65.8% | 45 |
- this was great, as noted in summary.
Move various checks to typeck so them failing causes the typeck result to get tainted #96046 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 0.9% | 1.1% | 4 |
Improvements (primary) | -0.3% | -0.4% | 4 |
Improvements (secondary) | -0.3% | -0.3% | 24 |
All (primary) | -0.3% | -0.4% | 4 |
- This had some small improvements to primary benchmarks.
- The CTFE stress test regressed, but I assume that was expected since this was a change to the CTFE engine (to address some ICE’s).
Update jemalloc to v5.3 #96790 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 0.3% | 0.3% | 1 |
Improvements (primary) | -0.9% | -6.2% | 212 |
Improvements (secondary) | -1.1% | -3.2% | 191 |
All (primary) | -0.9% | -6.2% | 212 |
- This had various improvements to our primary benchmarks’ instruction counts.
- The other big thing to observe: the max-rss was improved in several primary benchmarks (by 1% to 6%), and some secondary benchmarks saw even more significant improvements to their max-rss.
Split dead store elimination off dest prop #97158 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.5% | 1.3% | 15 |
Regressions (secondary) | 0.6% | 2.3% | 12 |
Improvements (primary) | -0.4% | -1.9% | 50 |
Improvements (secondary) | -0.6% | -1.3% | 33 |
All (primary) | -0.2% | -1.9% | 65 |
- The changes here were investigated by the PR author.
- Marking as triaged based on their investigation.
Try to cache region_scope_tree as a query #97383 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 1.0% | 1.0% | 3 |
Improvements (primary) | -1.7% | -4.7% | 98 |
Improvements (secondary) | -2.4% | -7.7% | 43 |
All (primary) | -1.7% | -4.7% | 98 |
- This was a targeted PR to address the regressions introduced by PR #95563 (Comparison Link).
- It seems to succeed at recovering the instruction-count performance lost in that PR.
- It doesn’t quite recover all of the max-rss lost (PR #97383 improves max-rss by about 3.75% while PR #95563 degraded max-rss by about 4%), but its close enough to satisfy our needs.
proc_macro: don’t pass a client-side function pointer through the server. #97461 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.5% | 0.5% | 2 |
Regressions (secondary) | N/A | N/A | 0 |
Improvements (primary) | -0.4% | -0.8% | 11 |
Improvements (secondary) | -2.6% | -5.4% | 10 |
All (primary) | -0.3% | -0.8% | 13 |
- On the primary benchmarks, this mostly yielded small improvements; the one exception was
serde_derive
(debug), which regressed by ~0.5% for the full and incr-full variants. - Looking at the flamegraphs for the before and after commits, and at the table at bottom of details page, it seems like the instruction-count regression is in
codegen_module
- Looking at the history of serde_derive-debug (massively zoomed in on the “Percent Delta from First” Graph kind), it seems reasonable to think that something did happen on this PR.
- …. but I also don’t really think its a big enough regression to be worth tearing our hair out over. This is a (smallish) win overall, and even for serde_derive-debug, it is a small regression in the context of much larger wins, so overall the trajectory is good.
- Marked as triaged.
Replace #[default_method_body_is_const]
with #[const_trait]
#96964 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.2% | 0.3% | 9 |
Regressions (secondary) | 0.2% | 0.3% | 3 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | -1.1% | -1.1% | 3 |
All (primary) | 0.2% | 0.3% | 9 |
- The primary regressions are mostly in variants of stm32f4, with a two serde and one syn thrown in for good measure.
- The improvements are solely to ctfe stress test, which I guess makes sense given the PR?
- Anyway, the regressions seem minor, and they are contained to an unstable feature that the stdlib is using.
- Marking as triaged.
improve format impl for literals #97480 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | 0.4% | 0.5% | 4 |
Regressions (secondary) | N/A | N/A | 0 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | -1.5% | -1.5% | 1 |
All (primary) | 0.4% | 0.5% | 4 |
- This is adding a fast-path so that
format!("literal")
will compile into the same code as"literal".to_owned()
. - The primary regression is solely contained to
bitmaps
. - Its possible that the regression to
bitmaps
is due toformat!("literal")
being totally unused in that code; all instances offormat!
there take an additional argument. So its possible that the extra code to check about whether to use the fast-path is slowing things down there. - But I personally don’t believe that explanation here: Unless I’m misunderstanding the code, there is some amount of macro-expansion into multiple instances of
format!
, but most of the expanded code is going to be dominated by all theimpl
blocks, not the relatively fewformat!
instances. (Unless I massively misunderstand how the macros and/or codegen and/or inlining end up linking up here.) - So: I don’t believe the best hypothesis I have for what is happening here.
- But I also do not think the regression here is large enough to warrant further investigation.
- Marking as triaged.
errors: simplify referring to fluent attributes #97357 (Comparison Link)
mean | max | count | |
---|---|---|---|
Regressions (primary) | N/A | N/A | 0 |
Regressions (secondary) | 0.4% | 0.6% | 9 |
Improvements (primary) | N/A | N/A | 0 |
Improvements (secondary) | -0.4% | -0.5% | 4 |
All (primary) | N/A | N/A | 0 |
- There were some regressions here, but they are few, minor, and contained solely to secondary benchmarks (specifically projection-caching and wg-grammar). Marking as triaged.
Nominated Issues
- “Update Mac Catalyst support for Clang 13” rust#96392
- No new RFC to be discussed
Next week’s WG checkins
- @_WG-traits Traits (generic work of the WG) by @nikomatsakis and @Jack Huey
- @_WG-mir-opt MIR Optimizations by @oli