Summary

Taking a reference into a struct marked repr(packed) should become unsafe, because it can lead to undefined behaviour. repr(packed) structs need to be banned from storing Drop types for this reason.

Motivation

Issue #27060 noticed that it was possible to trigger undefined behaviour in safe code via repr(packed), by creating references &T which don’t satisfy the expected alignment requirements for T.

Concretely, the compiler assumes that any reference (or raw pointer, in fact) will be aligned to at least align_of::<T>(), i.e. the following snippet should run successfully:

let some_reference: &T = /* arbitrary code */;

let actual_address = some_reference as *const _ as usize;
let align = std::mem::align_of::<T>();

assert_eq!(actual_address % align, 0);

However, repr(packed) allows on to violate this, by creating values of arbitrary types that are stored at “random” byte addresses, by removing the padding normally inserted to maintain alignment in structs. E.g. suppose there’s a struct Foo defined like #[repr(packed, C)] struct Foo { x: u8, y: u32 }, and there’s an instance of Foo allocated at a 0x1000, the u32 will be placed at 0x1001, which isn’t 4-byte aligned (the alignment of u32).

Issue #27060 has a snippet which crashes at runtime on at least two x86-64 CPUs (the author’s and the one playpen runs on) and almost certainly most other platforms.

#![feature(simd, test)]

extern crate test;

// simd types require high alignment or the CPU faults
#[simd]
#[derive(Debug, Copy, Clone)]
struct f32x4(f32, f32, f32, f32);

#[repr(packed)]
#[derive(Copy, Clone)]
struct Unalign<T>(T);

struct Breakit {
    x: u8,
    y: Unalign<f32x4>
}

fn main() {
    let val = Breakit { x: 0, y: Unalign(f32x4(0.0, 0.0, 0.0, 0.0)) };

    test::black_box(&val);

    println!("before");

    let ok = val.y;
    test::black_box(ok.0);

    println!("middle");

    let bad = val.y.0;
    test::black_box(bad);

    println!("after");
}

On playpen, it prints:

before
middle
playpen: application terminated abnormally with signal 4 (Illegal instruction)

That is, the bad variable is causing the CPU to fault. The let statement is (in pseudo-Rust) behaving like let bad = load_with_alignment(&val.y.0, align_of::<f32x4>());, but the alignment isn’t satisfied. (The ok line is compiled to a movupd instruction, while the bad is compiled to a movapd: u == unaligned, a == aligned.)

(NB. The use of SIMD types in the example is just to be able to demonstrate the problem on x86. That platform is generally fairly relaxed about pointer alignments and so SIMD & its specialised mov instructions are the easiest way to demonstrate the violated assumptions at runtime. Other platforms may fault on other types.)

Being able to assume that accesses are aligned is useful, for performance, and almost all references will be correctly aligned anyway (repr(packed) types and internal references into them are quite rare).

The problems with unaligned accesses can be avoided by ensuring that the accesses are actually aligned (e.g. via runtime checks, or other external constraints the compiler cannot understand directly). For example, consider the following

#[repr(packed, C)]
struct Bar {
    x: u8,
    y: u16,
    z: u8,
    w: u32,
}

Taking a reference to some of those fields may cause undefined behaviour, but not always. It is always correct to take a reference to x or z since u8 has alignment 1. If the struct value itself is 4-byte aligned (which is not guaranteed), w will also be 4-byte aligned since the u8, u16, u8 take up 4 bytes, hence it is correct to take a reference to w in this case (and only that case). Similarly, it is only correct to take a reference to y if the struct is at an odd address, so that the u16 starts at an even one (i.e. is 2-byte aligned).

Detailed design

It is unsafe to take a reference to the field of a repr(packed) struct. It is still possible, but it is up to the programmer to ensure that the alignment requirements are satisfied. Referencing (by-reference, or by-value) a subfield of a struct (including indexing elements of a fixed-length array) stored inside a repr(packed) struct counts as taking a reference to the packed field and hence is unsafe.

It is still legal to manipulate the fields of a packed struct by value, e.g. the following is correct (and not unsafe), no matter the alignment of bar:

let bar: Bar = ...;

let x = bar.y;
bar.w = 10;

It is illegal to store a type T implementing Drop (including a generic type) in a repr(packed) type, since the destructor of T is passed a reference to that T. The crater run (see appendix) found no crate that needs to use repr(packed) to store a Drop type (or a generic type). The generic type rule is conservatively approximated by disallowing generic repr(packed) structs altogether, but this can be relaxed (see Alternatives).

Concretely, this RFC is proposing the introduction of the // errors in the following code.

struct Baz {
    x: u8,
}

#[repr(packed)]
struct Qux<T> { // error: generic repr(packed) struct
    y: Baz,
    z: u8,
    w: String, // error: storing a Drop type in a repr(packed) struct
    t: [u8; 4],
}

let mut qux = Qux { ... };

// all ok:
let y_val = qux.y;
let z_val = qux.z;
let t_val = qux.t;
qux.y = Baz { ... };
qux.z = 10;
qux.t = [0, 1, 2, 3];

// new errors:

let y_ref = &qux.y; // error: taking a reference to a field of a repr(packed) struct is unsafe
let z_ref = &mut qux.z; // ditto
let y_ptr: *const _ = &qux.y; // ditto
let z_ptr: *mut _ = &mut qux.z; // ditto

let x_val = qux.y.x; // error: directly using a subfield of a field of a repr(packed) struct is unsafe
let x_ref = &qux.y.x; // ditto
qux.y.x = 10; // ditto

let t_val = qux.t[0]; // error: directly indexing an array in a field of a repr(packed) struct is unsafe
let t_ref = &qux.t[0]; // ditto
qux.t[0] = 10; // ditto

(NB. the subfield and indexing cases can be resolved by first copying the packed field’s value onto the stack, and then accessing the desired value.)

Staging

This change will first land as warnings indicating that code will be broken, with the warnings switched to the intended errors after one release cycle.

Drawbacks

This will cause some functionality to stop working in possibly-surprising ways (NB. the drawback here is mainly the “possibly-surprising”, since the functionality is broken with general packed types.). For example, #[derive] usually takes references to the fields of structs, and so #[derive(Clone)] will generate errors. However, this use of derive is incorrect in general (no guarantee that the fields are aligned), and, one can easily replace it by:

#[derive(Copy)]
#[repr(packed)]
struct Foo { ... }

impl Clone for Foo { fn clone(&self) -> Foo { *self } }

Similarly, println!("{}", foo.bar) will be an error despite there not being a visible reference (println! takes one internally), however, this can be resolved by, for instance, assigning to a temporary.

Alternatives

  • A short-term solution would be to feature gate repr(packed) while the kinks are worked out of it
  • Taking an internal reference could be made flat-out illegal, and the times when it is correct simulated by manual raw-pointer manipulation.
  • The rules could be made less conservative in several cases, however the crater run didn’t indicate any need for this:
    • a generic repr(packed) struct can use the generic in ways that avoids problems with Drop, e.g. if the generic is bounded by Copy, or if the type is only used in ways that are Copy such as behind a *const T.
    • using a subfield of a field of a repr(packed) struct by-value could be OK.

Unresolved questions

None.

Appendix

Crater analysis

Crater was run on 2015/07/23 with a patch that feature gated repr(packed).

High-level summary:

  • several unnecessary uses of repr(packed) (patches have been submitted and merged to remove all of these)
  • most necessary ones are to match the declaration of a struct in C
  • many “necessary” uses can be replaced by byte arrays/arrays of smaller types
  • 8 crates are currently on stable themselves (unsure about deps), 4 are already on nightly
    • 1 of the 8, http2parse, is essentially only used by a nightly-only crate (tendril)
    • 4 of the stable and 1 of the nightly crates don’t need repr(packed) at all
stableneededFFI only
image
nix
tendril
assimp-sys
stemmer
x86
http2parse
nl80211rs
openal
elfloader
x11
kiss3d

More detailed analysis inline with broken crates. (Don’t miss kiss3d in the non-root section.)

Regression report c85ba3e9cb4620c6ec8273a34cce6707e91778cb vs. 7a265c6d1280932ba1b881f31f04b03b20c258e5

  • From: c85ba3e9cb4620c6ec8273a34cce6707e91778cb
  • To: 7a265c6d1280932ba1b881f31f04b03b20c258e5

Coverage

  • 2617 crates tested: 1404 working / 1151 broken / 40 regressed / 0 fixed / 22 unknown.

Regressions

  • There are 11 root regressions
  • There are 40 regressions

Root regressions, sorted by rank:

  • image-0.3.11 (before) (after)

    • use seems entirely unnecessary (no raw bytewise operations on the struct itself)

    On stable.

  • nix-0.3.9 (before) (after)

    On stable.

  • tendril-0.1.2 (before) (after)

    • use 1 not strictly necessary?
    • use 2 required on 64-bit platforms to get size_of::<Header>() == 12 rather than 16.
    • use 3, as above, does some precise tricks with the layout for optimisation.

    Requires nightly.

  • assimp-sys-0.0.3 (before) (after)

    • many uses, required to match C structs (one example). In author’s words:

      [11:36:15] <eljay> huon: well my assimp binding is basically abandoned for now if you are just worried about breaking things, and seems unlikely anyone is using it :P

    On stable.

  • stemmer-0.1.1 (before) (after)

    • use, completely unnecessary

    On stable.

  • x86-0.2.0 (before) (after)

    Requires nightly.

  • http2parse-0.0.3 (before) (after)

    • use, used to get super-fast “parsing” of headers, by transmuting &[u8] to &[Setting].

    On stable, however:

    [11:30:38] <huon> reem: why is https://github.com/reem/rust-http2parse/blob/b363139ac2f81fa25db504a9256face9f8c799b6/src/payload.rs#L208 packed?
    [11:31:59] <reem> huon: I transmute from & [u8] to & [Setting]
    [11:32:35] <reem> So repr packed gets me the layout I need
    [11:32:47] <reem> With no padding between the u8 and u16
    [11:33:11] <reem> and between Settings
    [11:33:17] <huon> ok
    [11:33:22] <huon> (stop doing bad things :P )
    [11:34:00] <huon> (there's some problems with repr(packed) https://github.com/rust-lang/rust/issues/27060 and we may be feature gating it)
    [11:35:02] <huon> reem: wait, aren't there endianness problems?
    [11:36:16] <reem> Ah yes, looks like I forgot to finish the Setting interface
    [11:36:27] <reem> The identifier and value methods take care of converting to types values
    [11:36:39] <reem> The goal is just to avoid copying the whole buffer and requiring an allocation
    [11:37:01] <reem> Right now the whole parser takes like 9 ns to parse a frame
    [11:39:11] <huon> would you be sunk if repr(packed) was feature gated?
    [11:40:17] <huon> or, is maybe something like `struct SettingsRaw { identifier:  [u8; 2], value:  [u8; 4] }` OK (possibly with conversion functions etc.)?
    [11:40:46] <reem> Yea, I could get around it if I needed to
    [11:40:58] <reem> Anyway the primary consumer is transfer and I'm running on nightly there
    [11:41:05] <reem> So it doesn't matter too much
    
  • nl80211rs-0.1.0 (before) (after)

    On stable.

  • openal-0.2.1 (before) (after)

    On stable.

  • elfloader-0.0.1 (before) (after)

    Requires nightly.

  • x11cap-0.1.0 (before) (after)

    • use unnecessary.

    Requires nightly.

Non-root regressions, sorted by rank: