&uninit references
Introduce a first-class &uninit T pointer type whose initialisation state is
tracked by the compiler. Make it possible to return a marker proving
full-initialisation status of the &uninit T from functions.
Semantics
An &uninit T is a data pointer pointing to uninitialised, partially
initialised, or fully initialised data. The state of the initialisation is not
queryable using runtime functions; it is tracked at compile-time only.
The &uninit T is reborrowable into another &uninit T. The lifetime 'a in
&'a uninit T is invariant and guaranteed unique. An initialised field of an
&uninit T can be reborrowed as &(mut) Field.
When initially received, be it by creation or as a parameter, the &uninit T is
fully uninitialised. When dropped or reborrowed, the &uninit T drops the T
in place if initialised, or drops all initialised fields of T in place
otherwise.
An &uninit T can be split into multiple &uninit Field references in the same
way as a &mut T can be split into &mut Field references.
An &uninit T can be initialised by writing into it, or by using an
initialisation proof marker Initialised<'_>. The lifetime 'a of
Initialised<'a> is invariant. A field-wise or fully initialised &uninit T
can be reborrowed into an Initialised<'_>, which resets the &uninit T’s
initialisation status to fully uninitialised without performing Drop in place.
Initialising &uninit T or a local uninitialised T
One-shot initialisation
An &uninit T can be fully initialised by writing to it.
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
*r = T::new();
}
This marks the &uninit T as fully initialised and arms its Drop method.
Equivalently for a local uninitialised T:
#![allow(unused)]
fn main() {
let r: T;
r = T::new();
}
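For comparison, the closest analogue in today's stable Rust is `MaybeUninit<T>`: the write goes through `MaybeUninit::write`, and the programmer, not the compiler, asserts full initialisation with an unsafe `assume_init`. The following is a minimal sketch of that status quo and not of the proposed feature; `one_shot` is a hypothetical helper name.

```rust
use std::mem::MaybeUninit;

/// Hypothetical helper: initialise a slot in one shot, then take the value out.
pub fn one_shot() -> String {
    // Plays the role of `let r: T;`: storage exists, but holds no value yet.
    let mut slot: MaybeUninit<String> = MaybeUninit::uninit();
    // Plays the role of `*r = T::new();`: the write drops no previous value.
    slot.write(String::from("hello"));
    // Today the programmer must assert the initialisation state by hand.
    // SAFETY: `slot` was fully written on the line above.
    unsafe { slot.assume_init() }
}
```

Under the proposal the unsafe block would presumably be unnecessary, because the compiler itself tracks the state change caused by the write.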
Field-wise initialisation
An &uninit T where T is a struct can be initialised field-by-field using a
special initialisation syntax:
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
r.field1 <- Field1::new();
r.field2 <- Field2::new();
r <- _;
}
Alternatively, reuse normal assignment syntax:
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
r.field1 = Field1::new();
r.field2 = Field2::new();
*r = _;
}
Initialising a field makes the borrow checker track that field’s value as an
individual value. Initialising all fields still tracks the fields as individual
values, i.e. it does not consider the &uninit T to yet contain an initialised
T and does not therefore arm the Drop method of T.
The final r <- _; or *r = _; line is therefore required to complete the
initialisation; this finally arms the Drop method of T.
Equivalently for a local uninitialised T:
#![allow(unused)]
fn main() {
let r: T;
r.field1 = Field1::new();
r.field2 = Field2::new();
r = _;
}
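The field-by-field flow can be approximated in stable Rust with raw field pointers into a `MaybeUninit<T>`. This is only a sketch of the status quo under assumed names (`Pair` and `field_wise` are hypothetical); the final `assume_init` plays the role of the `r <- _;` step, except that nothing checks that every field was actually written.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical two-field struct used only for this illustration.
pub struct Pair {
    pub field1: String,
    pub field2: Vec<u8>,
}

/// Hypothetical helper: initialise `Pair` field by field, in place.
pub fn field_wise() -> Pair {
    let mut slot: MaybeUninit<Pair> = MaybeUninit::uninit();
    let p: *mut Pair = slot.as_mut_ptr();
    // SAFETY: `p` points to aligned, writable storage for a `Pair`.
    // `addr_of_mut!` forms field pointers without creating references to
    // uninitialised data, and `write` neither reads nor drops the old bytes.
    unsafe {
        addr_of_mut!((*p).field1).write(String::from("one")); // r.field1 <- ...
        addr_of_mut!((*p).field2).write(vec![2, 2]); // r.field2 <- ...
    }
    // Counterpart of `r <- _;`, but unchecked.
    // SAFETY: every field of `Pair` has been written above.
    unsafe { slot.assume_init() }
}
```

Forgetting one of the two writes would still compile here and would be undefined behaviour; that gap is what compiler-tracked initialisation state is meant to close.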
Initialisation functions
An &uninit T can be passed to a function as a parameter: the callee will
consider the &uninit T to be fully uninitialised. The callee can signal to the
caller that it has fully initialised the &uninit T by returning an
initialisation proof, here called Initialised<'_>.
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
let proof: Initialised<'_> = init_t(r);
r <- proof;
}
The initialisation proof is “notarised” onto the &uninit T using the r <- proof; syntax. An alternative is to use the standard pointer write syntax:
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
*r = init_t(r);
}
This requires special handling in the compiler as Initialised<'_> is not equal to T.
Another alternative would be to make dropping the proof automatically notarise the &uninit T:
#![allow(unused)]
fn main() {
let r: &uninit T = ?;
init_t(r);
}
This requires special handling in the compiler as dropping of Initialised<'_>
would have to happen immediately on the second line above, and its Drop
implementation would have to find the exact r: &uninit T based on the
invariant and guaranteed unique lifetime that they share, and notarise it.
A local uninitialised T cannot be initialised using an initialisation function
without taking an &uninit T reference to it:
#![allow(unused)]
fn main() {
let r: T;
init_t(&uninit r);
}
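The proof-returning pattern can be loosely approximated in stable Rust with a zero-sized token branded by an invariant lifetime. The sketch below rests on several assumptions: `Uninit`, `emplace` and `init_string` are hypothetical names, `Initialised` here is a library stand-in that may be dropped freely (unlike in the proposal), and the uniqueness of the lifetime comes from the higher-ranked closure bound on `emplace` rather than from compiler tracking.

```rust
use std::marker::PhantomData;
use std::mem::MaybeUninit;

/// Marker that makes a type invariant in `'a`.
type Invariant<'a> = PhantomData<fn(&'a ()) -> &'a ()>;

/// Hypothetical stand-in for `&'a uninit T`.
pub struct Uninit<'a, T> {
    slot: &'a mut MaybeUninit<T>,
    _brand: Invariant<'a>,
}

/// Hypothetical stand-in for `Initialised<'a>`: a zero-sized proof token.
pub struct Initialised<'a> {
    _brand: Invariant<'a>,
}

impl<'a, T> Uninit<'a, T> {
    /// One-shot write; consuming `self` yields the proof.
    pub fn write(self, value: T) -> Initialised<'a> {
        self.slot.write(value);
        Initialised { _brand: PhantomData }
    }
}

/// Runs `init` on fresh storage and "notarises" the proof it returns.
pub fn emplace<T>(init: impl for<'a> FnOnce(Uninit<'a, T>) -> Initialised<'a>) -> T {
    let mut slot = MaybeUninit::<T>::uninit();
    let _proof: Initialised<'_> = init(Uninit { slot: &mut slot, _brand: PhantomData });
    // SAFETY: `init` is generic over `'a`, so the only way it can produce an
    // `Initialised<'a>` is by writing through the `Uninit<'a, T>` it was given
    // (assuming the fields above stay private to this module).
    unsafe { slot.assume_init() }
}

/// An initialisation function in the style of `init_t`.
pub fn init_string(r: Uninit<'_, String>) -> Initialised<'_> {
    r.write(String::from("ready"))
}
```

A caller would write `let s: String = emplace(init_string);`. The sketch covers only one-shot writes; field-wise initialisation and reborrowing have no safe equivalent in this model.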
Justification
We are biased towards denying any drop of Initialised<'_> other than its use for
marking a place as initialised, because the type is intended for notarisation within
the current function call frame.
For more sophisticated post-initialisation manipulation of the initialised data,
we believe that it is more ergonomic and less error-prone if notarisation happens first
and the access to the data is handed off to other sub-routines via &mut borrows.
Syntax sugar
Having to write out &uninit is a nuisance in most cases. Many in-place
initialisation cases are dead-simple. For these cases, it would make sense to
have simple syntax sugar to deal with the nuisance.
One possibility would be to use the magic _ binding on the right-hand side of
an assignment with the meaning of “references the left-hand side”; this
reference would necessarily be an &uninit T since the left-hand side must be
uninitialised when the right-hand side is being evaluated.
This makes calling constructor functions much more pleasant:
#![allow(unused)]
fn main() {
fn init_r(r: &uninit Struct) -> Initialised<'_> { ... }
let r = init_r(_);
}
This would also work with fields:
#![allow(unused)]
fn main() {
struct Struct {
field1: Field1,
field2: Field2,
}
let r: Struct;
r.field1 = init_field1(_); // &uninit Field1 -> Initialised
r.field2 = init_field2(_); // &uninit Field2 -> Initialised
r = _;
}
This also applies to r = _; which now desugars into r = &uninit r: if all of
r’s fields are fully initialised then &uninit r can be reborrowed into an
Initialised<'_>. The initialisation proof can then be assigned into r to
finish its initialisation.
Pros & cons
Pros
- Out pointers are the way that in-place initialisation actually works on the concrete, on-the-metal level. You cannot make the problem simpler than it actually is.
- Out pointers are explicit and flexible: initialising functions (constructors) are free to choose their calling convention, and functions taking multiple out pointers are not an issue.
- Initialisation proofs enable making &uninit T reborrowable: an alternative approach of returning &init T pointers requires &uninit T to be a Move-only type. This also enables very efficient initialisation function APIs.
Cons
- &uninit T and Initialised<'_> are often explicitly spelled out; they are more verbose than automatic solutions. Syntax sugar helps a lot though.
- The implementation requires a non-trivial amount of new compiler features.
  - The characterisation of Initialised<'_> is that every instance must be discharged eventually, implying that the borrow-checker must also check for this at each return point.
  - In some cases, it will require the compiler to perform optimisation to eliminate panicking.
- &uninit T does not itself provide a direct path to solving e.g. in-place Box or Rc initialisation. The most direct solution of adding new APIs that take an impl FnOnce(&'a uninit T) -> Initialised<'a> runs into the same issues as impl Init does: the return type of the FnOnce spills into the return type of the new APIs, and the new APIs will need various variants to match the status quo, leading to API bloat.
NOTE
On the question of enforcing that Initialised<'_> is never dropped: if this
property is found objectionable, the back-up proposal is to weaken the
requirement and adopt a type that still carries an invariant lifetime but also
has the same drop implementation as the emplaced type T.
This type is also known as &'_ init T or &'_ own T, because the model
requires it to contain the pointer to the place under emplacement in order
for dropping to take place.
Examples
Prospective library extension
Out-pointers do not themselves answer the question of how smart pointers like Box
would perform in-place initialisation.
However, with sensible library design we believe that out-pointers can provide
better ergonomics when it comes to balancing between convenience, safety, and
flexibility.
#![allow(unused)]
fn main() {
impl<'a, T, A> BoxBuilder<'a, T, A> {
pub fn new_init(alloc: A) -> (&'a uninit T, Self) {
let ptr = if T::IS_ZST {
NonNull::dangling()
} else {
let layout = Layout::new::<mem::MaybeUninit<T>>();
alloc.allocate(layout).unwrap().cast()
};
(ptr as &'a uninit _, Self { ptr, alloc })
}
/// An example for fallible allocation, take 2.
pub fn new_try_init(alloc: A) -> Result<(&'a uninit T, Self), AllocError> {
let ptr = if T::IS_ZST {
NonNull::dangling()
} else {
let layout = Layout::new::<mem::MaybeUninit<T>>();
alloc.allocate(layout)?.cast()
};
Ok((ptr as &'a uninit _, Self { ptr, alloc }))
}
/// An example for fallible allocation.
pub fn new_opt_init(alloc: A) -> Option<(&'a uninit T, Self)> {
let ptr = if T::IS_ZST {
NonNull::dangling()
} else {
let layout = Layout::new::<mem::MaybeUninit<T>>();
alloc.allocate(layout).ok()?.cast()
};
Some((ptr as &'a uninit _, Self { ptr, alloc }))
}
pub fn finalise(self, initialize: Initialised<'a>) -> Box<T, A> {
unsafe {
// Safety: discharge initialize because we are going to set
// the Unique as initialized
initialize.discharge();
// Safety: we make a switch on the init state now.
Box::from_raw_in(self.ptr, self.alloc)
}
}
}
struct Struct {
data1: [u8; 32],
data2: [u8; 32],
}
let (uninit_struct, builder) = BoxBuilder::<'_, Struct>::new_init(Global);
uninit_struct.data1 <- [0; 32];
uninit_struct.data2 <- [4; 32];
let box_struct = builder.finalise(uninit_struct);
/// With this, convenient and opinionated Box emplacing constructors can be
/// built.
impl<T, A> Box<T, A> {
pub fn new_init(
alloc: A,
ctor: impl for<'a> FnOnce(&'a uninit T) -> Initialised<'a>,
) -> Self
{
let (uninit_struct, builder) = BoxBuilder::<'_, T>::new_init(alloc);
let initialize = ctor(&uninit_struct);
builder.finalise(initialize)
}
pub fn new_opt_init(
alloc: A,
ctor: impl for<'a> FnOnce(&'a uninit T) -> Option<Initialised<'a>>,
) -> Option<Self>
{
let (uninit_struct, builder) = BoxBuilder::<'_, T>::new_opt_init(alloc)?;
ctor(&uninit_struct).map(|initialize| {
builder.finalise(initialize)
})
}
pub fn new_try_init<F, E>(
alloc: A,
ctor: F,
) -> Result<Self, E>
where
F: for<'a> FnOnce(&'a uninit T) -> Result<Initialised<'a>, E>,
E: 'static + From<AllocError>,
{
let (uninit_struct, builder) = BoxBuilder::<'_, T>::new_try_init(alloc)?;
Ok(builder.finalise(ctor(&uninit_struct)?))
}
}
}
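As a point of reference, the allocate-then-initialise shape of `BoxBuilder` can be written in stable Rust today, but only with the safety contract stated in prose instead of carried by a proof. The sketch below is an assumption-laden analogue: `box_new_init`, `Data` and `boxed_data` are hypothetical names, and the global allocator is used instead of a custom `A`.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical payload mirroring the two-array struct used above.
pub struct Data {
    pub data1: [u8; 32],
    pub data2: [u8; 32],
}

/// Hypothetical helper: allocate first, initialise inside the heap slot, then
/// reinterpret the allocation as `Box<T>`.
///
/// # Safety
/// `ctor` must fully initialise the slot before it returns. Nothing checks
/// this; it is exactly the obligation an `Initialised<'a>` proof would carry.
pub unsafe fn box_new_init<T>(ctor: impl FnOnce(&mut MaybeUninit<T>)) -> Box<T> {
    let mut boxed: Box<MaybeUninit<T>> = Box::new(MaybeUninit::uninit());
    ctor(&mut *boxed);
    let raw: *mut T = Box::into_raw(boxed).cast::<T>();
    // SAFETY: `raw` comes from `Box::into_raw`, `MaybeUninit<T>` has the same
    // layout as `T`, and the caller promised that `ctor` initialised the slot.
    unsafe { Box::from_raw(raw) }
}

/// Field-wise initialisation directly in the heap allocation.
pub fn boxed_data() -> Box<Data> {
    // SAFETY: the closure writes both fields of `Data` through valid,
    // aligned field pointers.
    unsafe {
        box_new_init(|slot: &mut MaybeUninit<Data>| {
            let p = slot.as_mut_ptr();
            addr_of_mut!((*p).data1).write([0; 32]);
            addr_of_mut!((*p).data2).write([4; 32]);
        })
    }
}
```

The `unsafe fn` marker is the price of having no proof type: with `Initialised<'a>` as the constructor's return type, the wrapper itself could be a safe function.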
Correct usage
These are examples of correct and recommended usage.
One-shot initialisation
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
fn init_s(s: &uninit Struct) -> Initialised<'_> {
*s = Struct::default();
s
}
let s = init_s(_);
}
Field-wise initialisation
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
fn init_b(s: &uninit Box<u32>) -> Initialised<'_> {
*s = Default::default();
s
}
let s: Struct;
s.field1 = init_b(_);
s.field2 = init_b(_);
s = _;
}
C++ constructor ABI
A C++ constructor takes an *mut self parameter and returns it as (hopefully)
initialised. Implementing the equivalent with &uninit T is possible, though it
requires either that Initialised<'_> can be wrapped in ManuallyDrop (so it
must not be a true linear type), or that the planned unsafe fn drop_in_place
overriding feature is used.
#![allow(unused)]
fn main() {
struct Class { ... };
#[repr(transparent)]
struct Init<'a, T>(&'a uninit T, Initialised<'a>);
impl<T> Destruct for Init<'_, T> {
unsafe fn drop_in_place(&mut self) {
// NOTE: self.0 is considered fully uninitialised here.
// Mark self.0 as fully initialised: how to move out of self.1 though?
self.0 = self.1;
// Here self.0 is exiting the function and gets dropped in place.
}
}
extern "C" fn lib__Class__new(c: &uninit Class) -> Init<'_, Class> {
c.field1 = init_field1(_);
// ... other field inits here ...
// notarise field-wise initialised &uninit Class, arming its `Drop` and then
// immediately moving the Drop responsibility into proof.
let proof: Initialised<'_> = c;
// NOTE: c is now considered fully uninitialised as proof carries Drop
// responsibility. The Class itself is still initialised in memory.
Init(c, proof)
}
}
Calling a C++ constructor
#![allow(unused)]
fn main() {
struct Class { ... };
#[repr(transparent)]
struct Init<'a, T>(&'a uninit T, Initialised<'a>);
impl<T> Destruct for Init<'_, T> { ... } // same as above
impl<'a, T> Init<'a, T> {
fn into_proof(self) -> Initialised<'a> {
// NOTE: self.0 is considered fully uninitialised here.
self.1
// NOTE: self.0 is still considered fully uninitialised here and thus no
// drop in place is performed.
}
}
#[link(name = "lib")]
unsafe extern "C" {
fn lib__Class__new<'a>(c: &'a uninit Class) -> Init<'a, Class>;
}
let c: Class = unsafe { lib__Class__new(_) }.into_proof();
}
Fallible initialisation
#![allow(unused)]
fn main() {
fn try_init_s(s: &uninit Struct) -> Result<Initialised<'_>, Box<dyn Error>> { ... }
let s = try_init_s(_)?;
}
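In stable Rust, a loosely comparable fallible initialiser can return the `&mut T` produced by `MaybeUninit::write` in place of a proof. This is a sketch of an existing technique rather than of the proposal, closer in spirit to the `&init T` alternative; `try_init_u32` is a hypothetical function.

```rust
use std::mem::MaybeUninit;
use std::num::ParseIntError;

/// Hypothetical fallible initialiser. On success the returned `&mut u32`
/// shows that the slot now holds a value; on failure the slot is untouched.
pub fn try_init_u32<'a>(
    slot: &'a mut MaybeUninit<u32>,
    text: &str,
) -> Result<&'a mut u32, ParseIntError> {
    // Parse first: `?` returns early, before anything has been written.
    let value: u32 = text.parse()?;
    Ok(slot.write(value))
}
```

The returned reference keeps the slot mutably borrowed, so the caller cannot turn it into an owned, initialised local without unsafe code. This is one reason the text prefers a separate, lifetime-branded proof.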
Multiple out pointers
#![allow(unused)]
fn main() {
fn init_field1_and_field2<'a, 'b>(
v: u32,
field1: &'a uninit Field1,
field2: &'b uninit Field2
) -> (Initialised<'a>, Initialised<'b>) { ... }
let s: Struct;
init_field1_and_field2(3, &uninit s.field1, &uninit s.field2);
}
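A function with two out pointers can also be imitated in stable Rust by splitting one `MaybeUninit<Struct>` into per-field slots with raw pointers. The sketch below is only an analogue under assumed names (`Pair`, `init_both` and `init_pair` are hypothetical); the unsafe split and the final `assume_init` stand in for `&uninit s.field1`, `&uninit s.field2` and the notarisation step.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;

/// Hypothetical struct with two independently initialised fields.
pub struct Pair {
    pub field1: u32,
    pub field2: String,
}

/// Initialises two separate slots; each returned reference is tied to the
/// lifetime of its own slot, mirroring the distinct `'a` and `'b` above.
pub fn init_both<'a, 'b>(
    v: u32,
    field1: &'a mut MaybeUninit<u32>,
    field2: &'b mut MaybeUninit<String>,
) -> (&'a mut u32, &'b mut String) {
    (field1.write(v), field2.write(v.to_string()))
}

/// Splits an uninitialised `Pair` into field slots and fills both.
pub fn init_pair() -> Pair {
    let mut s = MaybeUninit::<Pair>::uninit();
    let p = s.as_mut_ptr();
    // SAFETY: the two field pointers are disjoint, aligned and valid for
    // writes, and `MaybeUninit<F>` has the same layout as `F`.
    let (f1, f2) = unsafe {
        (
            &mut *addr_of_mut!((*p).field1).cast::<MaybeUninit<u32>>(),
            &mut *addr_of_mut!((*p).field2).cast::<MaybeUninit<String>>(),
        )
    };
    init_both(3, f1, f2);
    // SAFETY: `init_both` wrote both fields of `Pair`.
    unsafe { s.assume_init() }
}
```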
Incorrect usage examples
These are examples of incorrect usage that must not compile.
Reference to partially initialised struct
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
fn init_b(s: &uninit Box<u32>) -> Initialised<'_> {
*s = Default::default();
s
}
let s: Struct;
s.field1 = init_b(_);
let _: &Struct = &s; // ERROR: used binding `s` isn't initialized
s.field2 = init_b(_);
let _: &Struct = &s; // ERROR: used binding `s` isn't initialized
s = _;
let _: &Struct = &s; // OK
}
Misuse examples
These are examples of correct but problematic usage: they compile but contain mistakes.
Partial initialisation undone
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
fn init_s(s: &uninit Struct) -> Initialised<'_> {
// NOTE: s is considered fully uninitialised at function entry here.
*s = Struct::default();
s
}
let s: Struct;
s.field1 = Default::default();
// MISTAKE: field1 was initialised and is dropped here.
s = init_s(_);
}
Partial initialisation undone by return
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
fn half_init_s(s: &uninit Struct) {
s.field1 = Default::default();
// MISTAKE: there is no way to return the half-initialised state. Thus,
// field1 is dropped here.
}
let s: Struct;
half_init_s(&uninit s);
s = Default::default();
}
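The drop-on-return behaviour in this example can be modelled at run time in stable Rust. The sketch makes a deliberate simplification: a boolean flag stands in for the initialisation state that the proposal tracks purely at compile time, and it covers a whole value, not individual fields. `Out`, `Tracked`, `half_init` and `drops_caused_by_half_init` are hypothetical names.

```rust
use std::mem::MaybeUninit;
use std::sync::atomic::{AtomicU32, Ordering};

/// Counts how many `Tracked` values have been dropped.
static DROPS: AtomicU32 = AtomicU32::new(0);

pub struct Tracked(pub u32);

impl Drop for Tracked {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Hypothetical run-time model of an `&uninit T`: a slot plus a state flag.
pub struct Out<'a, T> {
    slot: &'a mut MaybeUninit<T>,
    init: bool,
}

impl<'a, T> Out<'a, T> {
    /// A freshly received `Out` is considered fully uninitialised.
    pub fn new(slot: &'a mut MaybeUninit<T>) -> Self {
        Out { slot, init: false }
    }

    pub fn write(&mut self, value: T) {
        if self.init {
            self.init = false;
            // SAFETY: the flag said a value was present; drop it before overwriting.
            unsafe { self.slot.assume_init_drop() };
        }
        self.slot.write(value);
        self.init = true;
    }
}

impl<T> Drop for Out<'_, T> {
    fn drop(&mut self) {
        if self.init {
            // SAFETY: the flag is only set after a completed write.
            unsafe { self.slot.assume_init_drop() };
        }
    }
}

/// The callee initialises the slot but has no way to report that fact.
pub fn half_init(mut out: Out<'_, Tracked>) {
    out.write(Tracked(1));
    // `out` is dropped here, which drops the value in place.
}

/// Returns how many drops a call to `half_init` caused.
pub fn drops_caused_by_half_init() -> u32 {
    let before = DROPS.load(Ordering::SeqCst);
    let mut slot = MaybeUninit::<Tracked>::uninit();
    half_init(Out::new(&mut slot));
    DROPS.load(Ordering::SeqCst) - before
}
```

The caller ends up with an uninitialised slot again, which matches the MISTAKE comment above: work done by the callee is undone when it returns without a proof.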
Field-wise initialisation unfinished
#![allow(unused)]
fn main() {
struct Struct {
field1: Box<u32>,
field2: Box<u32>,
}
impl Drop for Struct {
fn drop(&mut self) {
eprintln!("Dropped");
}
}
fn init_b(s: &uninit Box<u32>) -> Initialised<'_> {
*s = Default::default();
s
}
let s: Struct;
s.field1 = init_b(_);
s.field2 = init_b(_);
// MISTAKE: s is never notarised and thus "Dropped" will not be printed.
}
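Stable Rust shows a related effect with `MaybeUninit`: if every field is written but the value is never asserted as initialised, the struct's `Drop` implementation does not run. The sketch below demonstrates this under assumed names (`Noisy`, `fill`, `unfinished`, `finished` and `struct_drops` are hypothetical). One difference should be kept in mind: with `MaybeUninit` the fields are leaked as well, whereas the text above describes initialised fields as still being dropped individually.

```rust
use std::mem::MaybeUninit;
use std::ptr::addr_of_mut;
use std::sync::atomic::{AtomicU32, Ordering};

/// Counts how many times `Noisy::drop` has run.
static STRUCT_DROPS: AtomicU32 = AtomicU32::new(0);

pub struct Noisy {
    pub field1: Box<u32>,
    pub field2: Box<u32>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        STRUCT_DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Writes both fields in place without marking the struct as initialised.
fn fill(slot: &mut MaybeUninit<Noisy>) {
    let p = slot.as_mut_ptr();
    // SAFETY: the field pointers are valid and aligned; `write` drops nothing.
    unsafe {
        addr_of_mut!((*p).field1).write(Box::new(1));
        addr_of_mut!((*p).field2).write(Box::new(2));
    }
}

/// MISTAKE: the value is never "notarised", so `Noisy::drop` never runs
/// (and, in this stable-Rust model, both boxes are leaked).
pub fn unfinished() {
    let mut slot = MaybeUninit::<Noisy>::uninit();
    fill(&mut slot);
}

/// Correct counterpart: assert initialisation, then let the value drop.
pub fn finished() {
    let mut slot = MaybeUninit::<Noisy>::uninit();
    fill(&mut slot);
    // SAFETY: `fill` wrote every field of `Noisy`.
    let value = unsafe { slot.assume_init() };
    drop(value);
}

pub fn struct_drops() -> u32 {
    STRUCT_DROPS.load(Ordering::SeqCst)
}
```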
API sketch