- Feature Name:
target_modifiers
- Start Date: 2024-10-24
- RFC PR: rust-lang/rfcs#3716
- Rust Issue: None
Summary
- We introduce the concept of “target modifier”. A target modifier is a flag where it may be unsound if you link together two compilation units that disagree on the flag.
- We fail the build if rustc can see two Rust compilation units that do not agree on the exact set of target modifier flags.
- There are already several existing flags that could fall into this category. There are also hypothetical new flags that do.
- The error can be silenced using the
-Cunsafe-allow-abi-mismatch
escape hatch. - Not having a stable way to build stdlib crates does not block stabilization of target modifiers.
- As a future extension we may be able to relax the rules to allow some specific kinds of mismatches.
- This RFC does not stabilize any target modifiers. That should happen in follow-up MCPs/FCPs/RFCs/etc.
Motivation
As Rust expands into low-level domains, there will be a need for precise control over how code is compiled. This often manifests as a new compiler flag. Some of these flags trigger undefined behavior if used incorrectly, which is in tension with Rust’s safety goals. This RFC proposes a new mechanism to allow use of such flags while also preventing undefined behavior.
The primary goal of this RFC is to unblock stabilization of target modifier flags. Adding them as unstable (and unsound) flags is already happening today without this RFC.
The Linux Kernel
The Linux Kernel has run into a handful of cases where it is necessary to tweak the ABI used in the kernel. Often, this is done conditionally depending on a configuration option. A few examples:
- When using
CONFIG_SHADOW_CALLSTACK
the x18 register must be reserved in the ABI with-Zfixed-x18
. - The
-Ctarget-feature=-neon
flag is used to prevent use of floating points on arm. - On 32-bit x86,
-Zreg-struct-return
and-Zregparm=3
are used. - When using
CONFIG_CFI_CLANG
the kCFI sanitizer is enabled with-Zsanitizer=kcfi
. Unlike most other sanitizers, this sanitizer is used in production. - In several different cases,
-Zpatchable-function-entry
is used to add nops before or after the function entrypoint. When mixed with-Zsanitizer=kcfi
this causes special considerations as kcfi works by placing a tag before the function entrypoint. - To support
MITIGATION_RETPOLINE
andMITIGATION_SLS
,target.json
is used on x86.
We expect there to be more examples in the future.
Sanitizers
There is an ongoing effort to stabilize some of the sanitizers. However, this effort explicitly aims to stabilize sanitizers that can be used without rebuilding the stdlib. With this RFC, that is no longer a blocker as the remaining sanitizers can be classified as a target modifier and then stabilized.
Embedded Targets
Currently, embedded platforms such as thumb*
or rv*
use separate targets
for configuration with significant ABI changes. For thumb*
targets, this is
currently limited to Hard- vs Soft-Float which can cause issues when linked.
However for RISC-V targets, the F (32-bit hardware float), D (64-bit hardware
float), and Q (128-bit hardware float) extensions all can potentially change
the ABI, which would increase the number of required targets. The
E extension may also change the ABI by limiting the number of
registers used by the I (integer operations) extension.
Existing -C flags that are unsound
It has recently been discovered that several existing -C
flags modify the
ABI, making them unsound. Examples:
-Csoft-float
-Ctarget-feature=-neon
-Clinker-plugin-lto
-Cllvm-args
- Possibly
-Ccode-model
and-Crelocation-model
This problem is a new discovery and it’s still not clear how to solve it. Target modifiers will not solve all of the flags; some flags are just unfixable and need to be removed. But I expect that target modifiers will be the solution for some of these flags.
Guide-level explanation
The Rust compiler has many flags that affect how source code is turned into
machine code. Some flags can be turned on and off for each CU (compilation
unit) in your project separately, and other flags must be applied to the entire
application as a whole. The typical reason for flags to be in the latter
category is that change some aspect of the ABI. For example,
-Zreg-struct-return
changes how to return a struct from a function call, and
both the caller and callee must agree on how to do that even if they are in
different CUs.
The Rust compiler will detect if you incorrectly use a flag that must be
applied to the application as a whole. For example, if you compile the standard
library with -Zreg-struct-return
, but don’t pass the flag when compiling a
dependency, then you will get the following error:
error: mixing -Zreg-struct-return will cause an ABI mismatch
help: This error occurs because the -Zreg-struct-return flag modifies the ABI,
and different crates in your project were compiled with inconsistent
settings.
help: To resolve this, ensure that -Zreg-struct-return is set to the same value
for all crates during compilation.
help: To ignore this error, recompile with the following flag:
-Cunsafe-allow-abi-mismatch=reg-struct-return
As an escape hatch, you can use -Cunsafe-allow-abi-mismatch=reg-struct-return
to disable the error. Using this flag is unsafe as incorrect use of the ABI is
undefined behavior. However, there may be some cases where the check is too
strict, and you can use the flag to proceed in those cases.
The requirement that all CUs agree includes stdlib crates (core, alloc, std),
so using these flags usually requires that you compile your own standard
library with -Zbuild-std
or by directly invoking rustc
. That said, some
flags (e.g., -Cpanic
) have mechanisms to avoid this requirement.
Reference-level explanation
A compiler flag can be classified as a target modifier. When a flag is a target modifier, it can be undefined behavior to link together two CUs that disagree on the flag.
To avoid unsoundness from mixing target modifiers, rustc will store the set of target modifiers in use in the crate metadata of each crate. Whenever rustc is invoked, it will inspect the crate metadata of all crates that are visible to it (usually the current crate and its direct dependencies) and emit an error if any of them have mismatched target modifiers in use.
The -Cunsafe-allow-abi-mismatch=flagname
flag can be used when compiling a
crate to indicate that it should not be included in the list of crates when
checking that all crates agree on flagname
.
Note that -Cunsafe-allow-abi-mismatch=flagname
should be passed to rustc when
compiling the crate that uses an incompatible value for flagname
, which may
not be the same rustc invocation as the one where the mismatch is detected. For
example, if you build four CUs A,B,C,D where D depends on A,B,C and C is the
only one with a different value for flagname
, then the mismatch is detected
by rustc when compiling D, but -Cunsafe-allow-abi-mismatch
should be used
when compiling C.
Stabilization
It is possible to stabilize target modifiers even if they cannot be utilized
without an unstable feature such as -Zbuild-std
.
Drawbacks
Teaching
We should be careful to not introduce too many concepts that end-users have to learn.
It is intentional that the guide-level section of this RFC does not use the
word “target modifier”. The “target modifier” name is not intended to be used
outside of the compiler internals and very technical documentation. Compiler
errors should not say “error: trying to mix target modifiers” or something like
that; rather the error should just say that mixing -Cfoo
may cause ABI
issues.
For similar reasons, the flag for silencing the error is called
-Cunsafe-allow-abi-mismatch
with the word “ABI” to avoid having to teach the
user about mismatched flags or target modifiers.
Rationale and alternatives
Why not just add flags like normal?
Preventing undefined behaviour is an important goal of the Rust project. If we add flags that change the abi, then that is in direct opposition to that goal, as mixing them would lead to UB.
Why not just add new targets?
The flag that started this entire discussion is -Zfixed-x18
. This flag
changes the ABI by changing the x18 register from a caller-saved temporary
register to a reserved register. At the time, people suggested adding a new
target (e.g., aarch64-unknown-none-fixed18
), instead of adding a dedicated
-Zfixed-x18
flag.
The primary benefit of adding a new target is that it’s a workaround for
-Zbuild-std
being unstable. Each new target will get a prebuilt stdlib, which
sidesteps the need for building your own stdlib.
This RFC does not propose this solution because:
- The primary benefit is not being blocked on stabilization of
-Zbuild-std
. However, I don’t think we really are blocked on stabilization of-Zbuild-std
in the first place. See the stabilization of target modifier flags section below. - Target modifiers help with other problems such as unblocking the
stabilization of sanitizers as well as existing
-C
flags that are unsound due to ABI issues. Adding new targets would leave these issues unsolved. - Adding new targets risks an exponential number of targets. In the kernel on x86 we would need 8 different targets to support the different possible kernel configurations. It’s not hard to imagine that number growing to 16 or 32 targets in the near future, especially once you consider that other embedded projects may have their own set of target modifier flags.
- Adding new prebuilt stdlibs does not actually help the projects that need
these flags. Even if a prebuilt stdlib is provided for every combination of
ABI-affecting flags that the kernel may need, the kernel has other reasons
that require building a custom
core
.
Why not use target.json
Because the target.json
feature is perma-unstable, and this RFC primarily
concerns itself with unblocking the stabilization of these flags. Adding
target modifiers as unstable flags is already happening today. (However, if
this RFC gets accepted, it becomes a soundness bug to add such unstable flags
without wiring them up with the target modifier machinery.)
One possible alternative would be to stabilize a subset of target.json
.
However, I don’t think there’s much benefit to this. It just means that you now
have to learn two different ways of passing flags to the compiler. See the
Teaching section above.
It would also be inconvenient to use in external build systems. Right now, the
kernel passes the -Zfixed-x18
flag like this:
ifeq ($(CONFIG_SHADOW_CALL_STACK), y)
KBUILD_CFLAGS += -ffixed-x18
KBUILD_RUSTFLAGS += -Zfixed-x18
endif
If -Zfixed-x18
had to be specified in a target.json
file, it would need to
happen in an entirely different part of the kernel build system. It is better
to specify the rustc flag together with the clang/gcc flag.
Stabilization of target modifiers
Using a target modifier without rebuilding the Rust stdlib is often not
possible. This means that some target modifiers can only be used in tandem with
-Zbuild-std
, which is currently unstable.
However, there’s no reason we have to block the stabilization of target
modifiers on the stabilization of -Zbuild-std
. If a target modifier -Cfoo
is stabilized, then you can break users of -Cfoo
with the reason “we changed
the way you pass flags to core
”, but you can’t break users with the reason
“we renamed -Cfoo
to -Cbar
; this is okay because you’re also using
-Zbuild-std
even though the rename is unrelated to -Zbuild-std
”.
Not all mismatches are unsound
This RFC says that mismatching target modifiers in any way results in a build error. However, there are a lot of cases where the real rules are more complicated than that. For example, with the following three CUs:
- CU A compiled with
-Zfixed-x18 -Zsanitizer=shadow-call-stack
. - CU B compiled with
-Zfixed-x18
. - CU C compiled with neither flag.
It is unsound to link together CUs A and C, but linking A with B or B with C is sound.
However, real-world scenarios where mismatching a target modifier is necessary are quite uncommon. The only case I’m aware of is the runtime for a sanitizer. For example, when ASAN (address sanitizer) detects a bug, it calls into a special ASAN-failure-handler function. The function for handling ASAN-failures should not be sanitized.
Making the compiler accept specific mismatches that are sound is out of scope
for this RFC. Such decisions will be made on a flag-by-flag basis in follow-up
decisions (most likely an MCP). Until then, end-users can use
-Cunsafe-allow-abi-mismatch
to proceed in such cases.
Cases that are not caught
This RFC proposes to store information in the crate metadata to detect ABI mismatches. However, this means that there are two cases that could result in mismatches being missed:
- When rustc is not doing the final link, different incompatible leaf modules might not get detected. For instance, using the CUs A,B,C from the previous section, then if stdlib is CU B and there are two leaf CUs A and C, then the incompatibility between A and C would not get detected unless rustc performs the final link.
- With dynamic linking, you may have two shared objects compiled completely separately with incompatible ABIs.
Note that the first situation can never happen with the base proposal: if we
require exact matches, then all CUs must agree because all CUs depend on
core
. The missed detection requires that we don’t consider AB or BC
incompatible. This could be an argument in favor of not allowing any mismatches
with the shadow call stack sanitizer (which you never want to mix in practice
anyway).
The dynamic linking case is considered acceptable. Detecting it is out of scope of this RFC.
Name mangling
It has been proposed that the modified target could be encoded in the name mangling scheme to help catch the two cases from the previous section. However, this raises a bunch of open questions:
- It probably does not help catch the first case. Dynamic function calls
between the leaf modules wouldn’t get caught, so that would require that one
of the leaf modules references a symbol defined by the other leaf module.
However, I find it hard to imagine this happening in the real world unless
the symbol is marked
#[no_mangle]
. - Similarly, if two dynamic objects are compiled completely separately, they
probably do not reference each other through anything other than symbols
marked
#[no_mangle]
. While it could potentially identify a mismatch where component A depends on component B, and component B is recompiled with a different ABI while using the old version of A, this scenario is not well-supported to start with because of Rust’s unstable ABI. - Some ABI-affecting flags only change the C ABI, but those symbols are
usually using
#[no_mangle]
. - Do we really want to make our symbol names even longer?
For the above reasons, name mangling is not proposed as a mechanism for detection for now. However, it could be a potential future addition.
Policy around flags that might not be ABI affecting
Some flags have an unclear status where it is unclear whether it affects the
ABI. For example, -Zpatchable-function-entry
(which adds nop instructions
before/after the function entrypoint) generally isn’t considered to affect the
ABI, but if combined with -Zsanitizer=kcfi
then it does affect it since kcfi
works by placing a hash of the function signature before the function
entrypoint. Since -Zsanitizer=kcfi
already needs to be a target modifier
anyway, you could argue that -Zpatchable-function-entry
doesn’t need to be
one.
However, for these cases where we are uncertain, we take a conservative approach and mark them as a target modifier. It is not breaking to relax the rules in future releases.
As for flags such as -Cllvm-args
that can do basically anything, it may make
more sense to just rename it to -Cunsafe-llvm-args
rather than use the target
modifier functionality.
Problems with mixing non-target-modifiers
I discussed this proposal with people from other communities (mainly kernel and C folks), and they shared several other cases where mixing flags are a problem. They pointed out that there are some flags where mixing them is really bad and should be detected, but which are not ABI issues or unsound per se. The most common example of this is exploit mitigations, where mixing the flags will silently lead to a vulnerable binary. On the other hand, ABI mismatches usually fail in a loud way, so they were not as concerned about those.
The sections below describe several such cases. They are intended to provide additional context for the reader to better understand the problem space. We will likely want to use the same infrastructure for detecting some of the mismatches mentioned below, but the precise list is out of scope of this RFC.
Since the cases below are not unsound, the flag for overriding them should not include the word “unsafe”.
Exploit mitigations
There are some mitigations that are used to mitigate CPU speculation vulnerabilities (e.g., SPECTRE) or used to make exploitation of vulnerabilities harder (e.g., control flow protection), which work by either telling the compiler to generate code that include instructions to prevent CPU speculation in some specific locations, or telling the compiler to generate code that checks that destinations of indirect branches are one of their valid destinations in the control flow graph. These mitigations usually don’t change the ABI, as they just change how code is generated within functions.
The problem is that the attacks you are trying to mitigate involve either forcing CPU speculation in some specific locations or changing the control flow to an arbitrary attacker-controlled address. If you have an unprotected specific location or unprotected indirect branch anywhere in your program, an attacker may still be able to use it either forcing CPU speculation in these unprotected specific locations, or by changing addresses in memory used by these unprotected indirect branches. This means that if you only apply the mitigation to some CUs, then the CUs that lack the mitigation will be completely unprotected, and the mitigation might be bypassable.
Sanitizers
This case is rather similar to exploit mitigations.
Some sanitizers can be mixed and matched between CUs without breaking the ABI. For example, on the Android aarch64 target, the shadow call stack sanitizer does not change the ABI, and can be freely mixed between CUs. However, the sanitizer does not catch violations in CUs that don’t enable the sanitizer.
For sanitizers used in production (such as shadow call stack or kcfi) this is particularly problematic, as a vulnerability in sanitized code may allow you to jump into unsanitized code.
.note.gnu.property
In the case of BTI (-Zbranch-protection=bti
), the mitigation relies on the
kernel’s ELF loader setting a special bit in the page table. However, setting
this bit is only valid if BTI is enabled everywhere. The compiler will use a
section called .note.gnu.property
to tell the linker whether BTI is enabled,
and the linker only propagates .note.gnu.property
if all CUs agree on it.
This means that if one CU is missing BTI, the linker will disable it for the
entire executable, and the kernel’s ELF loader will not set the bit in the page
tables when loading the machine code, rendering BTI ineffective.
Performance
Another reason is performance. One some targets, the precompiled stdlib always
comes with panic landing pads, even if you’re using -Cpanic=abort
. It’s also
usually compiled with a very minimal set of target features for greater
compatibility. These discrepancies can have an unacceptable impact on
performance.
Code patching
You might use -Zbranch-protection=pac-ret
or -Zpatchable-function-entry
to
insert special instructions at the beginning/end of all functions so you can
use runtime code-patching to replace them later. It is only because of the
runtime code-patching logic that these flags need to be used everywhere.
Debugging information
Mixing CUs with different options for -Cforce-unwind-tables
,
-Zdwarf-version
, or -Zdebuginfo-compression
may result in a binary that you
consider to be invalid as you may be unable to read the debugging information.
But it would not be an ABI issue.
Prior art
The panic strategy
The Rust compiler already has infrastructure to detect flag mismatches: the
flags -Cpanic
and -Zpanic-in-drop
. The prebuilt stdlib comes with different
pieces depending on which strategy is used, although panic landing flags are
not entirely removed when using -Cpanic=abort
, as only part of the prebuilt
stdlib is switched out.
Global target modifiers
A suggestion that has come up several times
(1,
2,
3)
is to have a variation of -Ctarget-feature=
that must be applied globally,
which could be called -Cglobal-target-features=
. This is very similar to this
RFC, though it is broader as the “target modifier” concept can apply to any
compiler flag and not just to a single -Cglobal-target-features=
flag.
Stabilization of things that require nightly features
This RFC proposes that we shouldn’t block stabilization of target modifiers on
a stable way to build libcore. There is precedent in the rust project for
unblocking stabilizations in this manner: When #![no_std]
was stabilized,
the RFC said the following:
As mentioned above, there are three separate lang items which are required by the libcore library to link correctly. These items are:
panic_fmt
stack_exhausted
eh_personality
This RFC does not attempt to stabilize these lang items for a number of reasons:
- The exact set of these lang items is somewhat nebulous and may change over time.
- The signatures of each of these lang items can either be platform-specific or it’s just “too weird” to stabilize.
- These items are pretty obscure and it’s not very widely known what they do or how they should be implemented.
Stabilization of these lang items (in any form) will be considered in a future RFC.
This means that no-std can’t actually be used without providing these symbols in some other way. Doing so is unstable.
.note.gnu.property
The .note.gnu.property
section discussed previously is an example of C code
detecting mismatches of a flag at link time.
Unresolved questions
This RFC does not stabilize any target modifiers. Such decisions should be made as a follow-up to this RFC on a flag-by-flag basis using the usual process for stabilizing a compiler flag.
The -Cunsafe-allow-abi-mismatch
flag will be stabilized when the first target
modifier is stabilized.
Future possibilities
A possible future extension could be to detect inconsistencies between the ABI of C code and Rust code. This would be an interesting extension, but it is not critical for target modifiers as calling into C is inherently unsafe to start with. Similarly, another possible future extension could be to catch ABI mismatches when using dynamic linking.