- Feature Name: NA
- Start Date: 2015-08-06
- RFC PR: rust-lang/rfcs#1240
- Rust Issue: rust-lang/rust#27060
Summary
Taking a reference into a struct marked repr(packed)
should become
unsafe
, because it can lead to undefined behaviour. repr(packed)
structs need to be banned from storing Drop
types for this reason.
Motivation
Issue #27060 noticed
that it was possible to trigger undefined behaviour in safe code via
repr(packed)
, by creating references &T
which don’t satisfy the
expected alignment requirements for T
.
Concretely, the compiler assumes that any reference (or raw pointer,
in fact) will be aligned to at least align_of::<T>()
, i.e. the
following snippet should run successfully:
let some_reference: &T = /* arbitrary code */;
let actual_address = some_reference as *const _ as usize;
let align = std::mem::align_of::<T>();
assert_eq!(actual_address % align, 0);
However, repr(packed)
allows on to violate this, by creating values
of arbitrary types that are stored at “random” byte addresses, by
removing the padding normally inserted to maintain alignment in
struct
s. E.g. suppose there’s a struct Foo
defined like
#[repr(packed, C)] struct Foo { x: u8, y: u32 }
, and there’s an
instance of Foo
allocated at a 0x1000, the u32
will be placed at
0x1001
, which isn’t 4-byte aligned (the alignment of u32
).
Issue #27060 has a snippet which crashes at runtime on at least two x86-64 CPUs (the author’s and the one playpen runs on) and almost certainly most other platforms.
#![feature(simd, test)]
extern crate test;
// simd types require high alignment or the CPU faults
#[simd]
#[derive(Debug, Copy, Clone)]
struct f32x4(f32, f32, f32, f32);
#[repr(packed)]
#[derive(Copy, Clone)]
struct Unalign<T>(T);
struct Breakit {
x: u8,
y: Unalign<f32x4>
}
fn main() {
let val = Breakit { x: 0, y: Unalign(f32x4(0.0, 0.0, 0.0, 0.0)) };
test::black_box(&val);
println!("before");
let ok = val.y;
test::black_box(ok.0);
println!("middle");
let bad = val.y.0;
test::black_box(bad);
println!("after");
}
On playpen, it prints:
before
middle
playpen: application terminated abnormally with signal 4 (Illegal instruction)
That is, the bad
variable is causing the CPU to fault. The let
statement is (in pseudo-Rust) behaving like let bad = load_with_alignment(&val.y.0, align_of::<f32x4>());
, but the
alignment isn’t satisfied. (The ok
line is compiled to a movupd
instruction, while the bad
is compiled to a movapd
: u
==
unaligned, a
== aligned.)
(NB. The use of SIMD types in the example is just to be able to
demonstrate the problem on x86. That platform is generally fairly
relaxed about pointer alignments and so SIMD & its specialised mov
instructions are the easiest way to demonstrate the violated
assumptions at runtime. Other platforms may fault on other types.)
Being able to assume that accesses are aligned is useful, for
performance, and almost all references will be correctly aligned
anyway (repr(packed)
types and internal references into them are
quite rare).
The problems with unaligned accesses can be avoided by ensuring that the accesses are actually aligned (e.g. via runtime checks, or other external constraints the compiler cannot understand directly). For example, consider the following
#[repr(packed, C)]
struct Bar {
x: u8,
y: u16,
z: u8,
w: u32,
}
Taking a reference to some of those fields may cause undefined
behaviour, but not always. It is always correct to take
a reference to x
or z
since u8
has alignment 1. If the struct
value itself is 4-byte aligned (which is not guaranteed), w
will
also be 4-byte aligned since the u8, u16, u8
take up 4 bytes, hence
it is correct to take a reference to w
in this case (and only that
case). Similarly, it is only correct to take a reference to y
if the
struct is at an odd address, so that the u16
starts at an even one
(i.e. is 2-byte aligned).
Detailed design
It is unsafe
to take a reference to the field of a repr(packed)
struct. It is still possible, but it is up to the programmer to ensure
that the alignment requirements are satisfied. Referencing
(by-reference, or by-value) a subfield of a struct (including indexing
elements of a fixed-length array) stored inside a repr(packed)
struct counts as taking a reference to the packed
field and hence is
unsafe.
It is still legal to manipulate the fields of a packed
struct by
value, e.g. the following is correct (and not unsafe
), no matter the
alignment of bar
:
let bar: Bar = ...;
let x = bar.y;
bar.w = 10;
It is illegal to store a type T
implementing Drop
(including a
generic type) in a repr(packed)
type, since the destructor of T
is
passed a reference to that T
. The crater run (see appendix) found no
crate that needs to use repr(packed)
to store a Drop
type (or a
generic type). The generic type rule is conservatively approximated by
disallowing generic repr(packed)
structs altogether, but this can be
relaxed (see Alternatives).
Concretely, this RFC is proposing the introduction of the // error
s
in the following code.
struct Baz {
x: u8,
}
#[repr(packed)]
struct Qux<T> { // error: generic repr(packed) struct
y: Baz,
z: u8,
w: String, // error: storing a Drop type in a repr(packed) struct
t: [u8; 4],
}
let mut qux = Qux { ... };
// all ok:
let y_val = qux.y;
let z_val = qux.z;
let t_val = qux.t;
qux.y = Baz { ... };
qux.z = 10;
qux.t = [0, 1, 2, 3];
// new errors:
let y_ref = &qux.y; // error: taking a reference to a field of a repr(packed) struct is unsafe
let z_ref = &mut qux.z; // ditto
let y_ptr: *const _ = &qux.y; // ditto
let z_ptr: *mut _ = &mut qux.z; // ditto
let x_val = qux.y.x; // error: directly using a subfield of a field of a repr(packed) struct is unsafe
let x_ref = &qux.y.x; // ditto
qux.y.x = 10; // ditto
let t_val = qux.t[0]; // error: directly indexing an array in a field of a repr(packed) struct is unsafe
let t_ref = &qux.t[0]; // ditto
qux.t[0] = 10; // ditto
(NB. the subfield and indexing cases can be resolved by first copying the packed field’s value onto the stack, and then accessing the desired value.)
Staging
This change will first land as warnings indicating that code will be broken, with the warnings switched to the intended errors after one release cycle.
Drawbacks
This will cause some functionality to stop working in
possibly-surprising ways (NB. the drawback here is mainly the
“possibly-surprising”, since the functionality is broken with general
packed
types.). For example, #[derive]
usually takes references to
the fields of structs, and so #[derive(Clone)]
will generate
errors. However, this use of derive is incorrect in general (no
guarantee that the fields are aligned), and, one can easily replace it
by:
#[derive(Copy)]
#[repr(packed)]
struct Foo { ... }
impl Clone for Foo { fn clone(&self) -> Foo { *self } }
Similarly, println!("{}", foo.bar)
will be an error despite there
not being a visible reference (println!
takes one internally),
however, this can be resolved by, for instance, assigning to a
temporary.
Alternatives
- A short-term solution would be to feature gate
repr(packed)
while the kinks are worked out of it - Taking an internal reference could be made flat-out illegal, and the times when it is correct simulated by manual raw-pointer manipulation.
- The rules could be made less conservative in several cases, however
the crater run didn’t indicate any need for this:
- a generic
repr(packed)
struct can use the generic in ways that avoids problems withDrop
, e.g. if the generic is bounded byCopy
, or if the type is only used in ways that areCopy
such as behind a*const T
. - using a subfield of a field of a
repr(packed)
struct by-value could be OK.
- a generic
Unresolved questions
None.
Appendix
Crater analysis
Crater was run on 2015/07/23 with a patch that feature gated repr(packed)
.
High-level summary:
- several unnecessary uses of
repr(packed)
(patches have been submitted and merged to remove all of these) - most necessary ones are to match the declaration of a struct in C
- many “necessary” uses can be replaced by byte arrays/arrays of smaller types
- 8 crates are currently on stable themselves (unsure about deps), 4 are already on nightly
- 1 of the 8, http2parse, is essentially only used by a nightly-only crate (tendril)
- 4 of the stable and 1 of the nightly crates don’t need
repr(packed)
at all
stable | needed | FFI only | |
---|---|---|---|
image | ✓ | ||
nix | ✓ | ✓ | ✓ |
tendril | ✓ | ||
assimp-sys | ✓ | ✓ | ✓ |
stemmer | ✓ | ||
x86 | ✓ | ✓ | ✓ |
http2parse | ✓ | ✓ | |
nl80211rs | ✓ | ✓ | ✓ |
openal | ✓ | ||
elfloader | ✓ | ✓ | |
x11 | ✓ | ||
kiss3d | ✓ |
More detailed analysis inline with broken crates. (Don’t miss kiss3d
in the non-root section.)
Regression report c85ba3e9cb4620c6ec8273a34cce6707e91778cb vs. 7a265c6d1280932ba1b881f31f04b03b20c258e5
- From: c85ba3e9cb4620c6ec8273a34cce6707e91778cb
- To: 7a265c6d1280932ba1b881f31f04b03b20c258e5
Coverage
- 2617 crates tested: 1404 working / 1151 broken / 40 regressed / 0 fixed / 22 unknown.
Regressions
- There are 11 root regressions
- There are 40 regressions
Root regressions, sorted by rank:
-
image-0.3.11 (before) (after)
- use seems entirely unnecessary (no raw bytewise operations on the struct itself)
On stable.
-
On stable.
-
tendril-0.1.2 (before) (after)
- use 1 not strictly necessary?
- use 2 required on 64-bit platforms to get size_of::<Header>() == 12 rather than 16.
- use 3, as above, does some precise tricks with the layout for optimisation.
Requires nightly.
-
assimp-sys-0.0.3 (before) (after)
-
many uses, required to match C structs (one example). In author’s words:
[11:36:15] <eljay> huon: well my assimp binding is basically abandoned for now if you are just worried about breaking things, and seems unlikely anyone is using it :P
On stable.
-
-
stemmer-0.1.1 (before) (after)
- use, completely unnecessary
On stable.
-
- several similar uses, specific layout necessary for raw interaction with CPU features
Requires nightly.
-
http2parse-0.0.3 (before) (after)
- use,
used to get super-fast “parsing” of headers, by transmuting
&[u8]
to&[Setting]
.
On stable, however:
[11:30:38] <huon> reem: why is https://github.com/reem/rust-http2parse/blob/b363139ac2f81fa25db504a9256face9f8c799b6/src/payload.rs#L208 packed? [11:31:59] <reem> huon: I transmute from & [u8] to & [Setting] [11:32:35] <reem> So repr packed gets me the layout I need [11:32:47] <reem> With no padding between the u8 and u16 [11:33:11] <reem> and between Settings [11:33:17] <huon> ok [11:33:22] <huon> (stop doing bad things :P ) [11:34:00] <huon> (there's some problems with repr(packed) https://github.com/rust-lang/rust/issues/27060 and we may be feature gating it) [11:35:02] <huon> reem: wait, aren't there endianness problems? [11:36:16] <reem> Ah yes, looks like I forgot to finish the Setting interface [11:36:27] <reem> The identifier and value methods take care of converting to types values [11:36:39] <reem> The goal is just to avoid copying the whole buffer and requiring an allocation [11:37:01] <reem> Right now the whole parser takes like 9 ns to parse a frame [11:39:11] <huon> would you be sunk if repr(packed) was feature gated? [11:40:17] <huon> or, is maybe something like `struct SettingsRaw { identifier: [u8; 2], value: [u8; 4] }` OK (possibly with conversion functions etc.)? [11:40:46] <reem> Yea, I could get around it if I needed to [11:40:58] <reem> Anyway the primary consumer is transfer and I'm running on nightly there [11:41:05] <reem> So it doesn't matter too much
- use,
used to get super-fast “parsing” of headers, by transmuting
-
nl80211rs-0.1.0 (before) (after)
- three similar uses to match C struct.
On stable.
-
openal-0.2.1 (before) (after)
- several similar uses,
probably unnecessary, just need the struct to behave like
[f32; 3]
: pointers to it are passed to functions expecting*mut f32
pointers.
On stable.
- several similar uses,
probably unnecessary, just need the struct to behave like
-
elfloader-0.0.1 (before) (after)
- two similar uses, required to match file headers/formats exactly.
Requires nightly.
-
x11cap-0.1.0 (before) (after)
- use unnecessary.
Requires nightly.
Non-root regressions, sorted by rank:
-
glium-0.8.0 (before) (after)
-
caribon-0.6.2 (before) (after)
-
assimp-0.0.4 (before) (after)
-
jamkit-0.2.4 (before) (after)
-
coap-0.1.0 (before) (after)
-
kiss3d-0.1.2 (before) (after)
- use seems to be unnecessary: semantically useless, just a space “optimisation”, which actually makes no difference because the Vec field will be appropriately aligned always.
On stable.
-
rustty-0.1.3 (before) (after)
-
spidev-0.1.0 (before) (after)
-
falcon-0.0.1 (before) (after)