Layout of scalar types
Disclaimer: This chapter represents the consensus from issue #9. The statements in here are not (yet) "guaranteed" to hold and may change until an RFC ratifies them.
This documents the memory layout and considerations for bool, char, floating point types (f{32, 64}), and integral types ({i,u}{8,16,32,64,128,size}). These types are all scalar types, representing a single value, and have no layout #[repr()] flags.
bool
Rust's bool has the same layout as C17's _Bool, that is, its size and alignment are implementation-defined. Any bool can be cast into an integer, taking on the values 1 (true) or 0 (false).
Note: on all platforms that Rust currently supports, its size and alignment are 1, and its ABI class is INTEGER; see Rust Layout and ABIs.
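The cast behaviour and the size and alignment mentioned in the note can be checked with a few assertions. This is a minimal sketch: bool_to_u8 is a hypothetical helper introduced for illustration, and the size and alignment checks reflect currently supported platforms rather than a ratified guarantee.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: converts a `bool` to an integer with an `as` cast.
fn bool_to_u8(b: bool) -> u8 {
    b as u8
}

fn main() {
    // Casting yields exactly 1 for `true` and 0 for `false`.
    assert_eq!(bool_to_u8(true), 1);
    assert_eq!(bool_to_u8(false), 0);

    // True on the platforms Rust currently supports (see the note above).
    assert_eq!(size_of::<bool>(), 1);
    assert_eq!(align_of::<bool>(), 1);
}
```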
char
Rust's char is 32 bits wide and represents a Unicode scalar value. The alignment of char is implementation-defined.
Note: Rust's char type is not layout compatible with the C / C++ char types. The C / C++ char types correspond to either Rust's i8 or u8 types on all currently supported platforms, depending on their signedness. Rust does not support C platforms on which the C char is not 8 bits wide.
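The difference between Rust's char and the C char types can be sketched as follows. The helper is_scalar_value is hypothetical and only illustrates that not every 32-bit pattern is a valid char; c_char is taken from std::os::raw here, as an assumption, to keep the example free of external crates.

```rust
use std::mem::size_of;
use std::os::raw::c_char;

/// Hypothetical helper: true if `v` is a Unicode scalar value,
/// i.e. a bit pattern that is valid for `char`.
fn is_scalar_value(v: u32) -> bool {
    char::from_u32(v).is_some()
}

fn main() {
    // Rust's `char` occupies 32 bits; the C `char` alias occupies 8.
    assert_eq!(size_of::<char>(), 4);
    assert_eq!(size_of::<c_char>(), 1);

    // 'A' is U+0041.
    assert_eq!('A' as u32, 0x41);
    assert!(is_scalar_value(0x41));

    // Surrogates and values above U+10FFFF are not scalar values.
    assert!(!is_scalar_value(0xD800));
    assert!(!is_scalar_value(0x11_0000));
}
```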
isize and usize
The isize and usize types are pointer-sized signed and unsigned integers. They have the same layout as the pointer types for which the pointee is Sized, and are layout compatible with C's uintptr_t and intptr_t types.
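A small sketch of the pointer-sized property: same_layout_as_usize is a hypothetical helper that compares the size and alignment of a thin pointer with those of usize. The final check on a slice pointer describes the current implementation, in which pointers to unsized pointees carry extra metadata.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical helper: true if `*const T` (with `T: Sized`, the implicit
/// bound) has the same size and alignment as `usize`.
fn same_layout_as_usize<T>() -> bool {
    size_of::<*const T>() == size_of::<usize>()
        && align_of::<*const T>() == align_of::<usize>()
}

fn main() {
    assert!(same_layout_as_usize::<u8>());
    assert!(same_layout_as_usize::<[u64; 4]>());

    // `isize` and `usize` have the same size as each other.
    assert_eq!(size_of::<isize>(), size_of::<usize>());

    // Pointers to unsized pointees are larger in the current implementation,
    // so the statement above is limited to `Sized` pointees.
    assert!(size_of::<*const [u8]>() > size_of::<usize>());
}
```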
Note: C99 7.18.2.4 requires uintptr_t and intptr_t to be at least 16-bit wide. All platforms we currently support have a C platform, and as a consequence, isize/usize are at least 16-bit wide for all of them.
Note: Rust's usize and C's unsigned types are not equivalent. C's unsigned is at least as large as a short, allowed to have padding bits, etc. but it is not necessarily pointer-sized.
Note: in the current Rust implementation, the layouts of isize and usize determine the following:
- the maximum size of Rust allocations is limited to isize::MAX. The LLVM getelementptr instruction uses signed-integer field offsets. Rust calls getelementptr with the inbounds flag, which assumes that field offsets do not overflow,
- the maximum number of elements in an array is usize::MAX ([T; N: usize]). Probably only ZST arrays can be this large in practice; non-ZST arrays are bound by the maximum size of Rust values,
- the maximum value in bytes by which a pointer can be offset using ptr.add or ptr.offset is isize::MAX.
These limits have not gone through the RFC process and are not guaranteed to hold.
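These limits can be observed through std::alloc::Layout, which refuses to describe an array whose total size exceeds isize::MAX. The sketch below assumes a reasonably recent toolchain; array_bytes is a hypothetical helper, and the behaviour shown reflects the current implementation rather than a ratified guarantee.

```rust
use std::alloc::Layout;
use std::mem::size_of;

/// Hypothetical helper: the size in bytes of `[T; n]`, or `None` if such
/// an array would exceed the allocation size limit.
fn array_bytes<T>(n: usize) -> Option<usize> {
    Layout::array::<T>(n).ok().map(|layout| layout.size())
}

fn main() {
    // An ordinary array fits.
    assert_eq!(array_bytes::<u32>(4), Some(16));

    // A byte array of exactly isize::MAX bytes is still describable,
    // one more byte is not.
    let limit = isize::MAX as usize;
    assert_eq!(array_bytes::<u8>(limit), Some(limit));
    assert_eq!(array_bytes::<u8>(limit + 1), None);

    // usize::MAX elements only work for zero-sized element types.
    assert_eq!(array_bytes::<u16>(usize::MAX), None);
    assert_eq!(array_bytes::<()>(usize::MAX), Some(0));
    assert_eq!(size_of::<[(); usize::MAX]>(), 0);
}
```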
Fixed-width integer types
For all Rust's fixed-width integer types {i,u}{8,16,32,64,128} it holds that:
- these types have no padding bits,
- their size exactly matches their bit-width,
- negative values of signed integer types are represented using 2's complement.
Furthermore, Rust's signed and unsigned fixed-width integer types {i,u}{8,16,32,64} have the same layout as the C fixed-width integer types from the <stdint.h> header {u,}int{8,16,32,64}_t. These fixed-width integer types are therefore safe to use directly in C FFI where the corresponding C fixed-width integer types are expected.
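The three properties listed above (no padding bits, size equal to bit-width, 2's complement) can be illustrated with a few checks. This is a sketch only; twos_complement_bytes is a hypothetical helper that exposes the little-endian byte representation of an i32.

```rust
use std::mem::size_of;

/// Hypothetical helper: the little-endian bytes of an `i32`.
fn twos_complement_bytes(x: i32) -> [u8; 4] {
    x.to_le_bytes()
}

fn main() {
    // Size in bytes times 8 equals the bit-width in the type name.
    assert_eq!(size_of::<i8>() * 8, 8);
    assert_eq!(size_of::<u16>() * 8, 16);
    assert_eq!(size_of::<i32>() * 8, 32);
    assert_eq!(size_of::<u64>() * 8, 64);
    assert_eq!(size_of::<i128>() * 8, 128);

    // No padding bits: every bit of the storage is a value bit.
    assert_eq!(u32::MAX.count_ones() as usize, size_of::<u32>() * 8);

    // 2's complement: -1 is all ones, -256 clears only the low byte.
    assert_eq!(twos_complement_bytes(-1), [0xFF, 0xFF, 0xFF, 0xFF]);
    assert_eq!(twos_complement_bytes(-256), [0x00, 0xFF, 0xFF, 0xFF]);
    assert_eq!(i32::MIN, -i32::MAX - 1);
}
```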
The alignment of Rust's {i,u}128 is unspecified and allowed to change.
Note: While the C standard does not define fixed-width 128-bit wide integer types, many C compilers provide non-standard __int128 types as a language extension. The layout of {i,u}128 in the current Rust implementation does not match that of these C types, see rust-lang/#54341.
Layout compatibility with C native integer types
The specification of native C integer types, char, short, int, long, ... as well as their unsigned variants, guarantees a lower bound on their size, e.g., short is at least 16-bit wide and at least as wide as char. Their exact sizes are implementation-defined.
Libraries like libc use knowledge of this implementation-defined behavior on each platform to select a layout-compatible Rust fixed-width integer type when interfacing with native C integer types (e.g. libc::c_int).
Note: Rust does not support C platforms on which the C native integer types are not compatible with any of Rust's fixed-width integer types (e.g. because of padding bits, lack of 2's complement, etc.).
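The lower bounds described in this section can be checked against the platform-selected aliases. The sketch below uses the aliases from std::os::raw instead of the libc crate, as an assumption to keep it dependency-free; widths_in_bits is a hypothetical helper.

```rust
use std::mem::size_of;
use std::os::raw::{c_char, c_int, c_long, c_short};

/// Hypothetical helper: bit-widths of `char`, `short`, `int`, and `long`
/// as selected for the current target.
fn widths_in_bits() -> [usize; 4] {
    [
        size_of::<c_char>() * 8,
        size_of::<c_short>() * 8,
        size_of::<c_int>() * 8,
        size_of::<c_long>() * 8,
    ]
}

fn main() {
    let [char_bits, short_bits, int_bits, long_bits] = widths_in_bits();

    // Lower bounds from the C specification; exact widths vary by platform.
    assert_eq!(char_bits, 8);
    assert!(short_bits >= 16 && short_bits >= char_bits);
    assert!(int_bits >= 16 && int_bits >= short_bits);
    assert!(long_bits >= 32 && long_bits >= int_bits);
}
```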
Fixed-width floating point types
Rust's f32 and f64 single (32-bit) and double (64-bit) precision floating-point types have IEEE-754 binary32 and binary64 floating-point layouts, respectively.
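The binary32 layout can be made concrete by splitting an f32 into its sign, exponent, and fraction fields. This is an illustrative sketch; decompose is a hypothetical helper, and the bit patterns in the assertions follow from the IEEE-754 encoding.

```rust
/// Hypothetical helper: splits an `f32` into its binary32 fields,
/// (sign bit, biased 8-bit exponent, 23-bit fraction).
fn decompose(x: f32) -> (u32, u32, u32) {
    let bits = x.to_bits();
    (bits >> 31, (bits >> 23) & 0xFF, bits & 0x7F_FFFF)
}

fn main() {
    // 1.0 = +1.0 * 2^(127 - 127)
    assert_eq!(decompose(1.0), (0, 127, 0));
    // -2.0 = -1.0 * 2^(128 - 127)
    assert_eq!(decompose(-2.0), (1, 128, 0));
    // 1.5 sets the top fraction bit.
    assert_eq!(decompose(1.5), (0, 127, 0x40_0000));

    // Whole-word bit patterns for binary32 and binary64.
    assert_eq!(1.0f32.to_bits(), 0x3F80_0000);
    assert_eq!(1.0f64.to_bits(), 0x3FF0_0000_0000_0000);
}
```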
When the platform's "math.h" header defines the __STDC_IEC_559__ macro, Rust's floating-point types are safe to use directly in C FFI where the appropriate C types are expected (f32 for float, f64 for double).
If the C platform's "math.h" header does not define the __STDC_IEC_559__ macro, whether f32 and f64 are safe to use in C FFI, and for which C types, is implementation-defined.
Note: the libc crate uses knowledge of each platform's implementation-defined behavior to provide portable libc::c_float and libc::c_double types that can be used to safely interface with C via FFI.
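A function with a C ABI signature can be written in terms of these aliases instead of f32 and f64 directly. The sketch below assumes the equivalent aliases from std::os::raw in place of the libc crate, and scale is a hypothetical function standing in for a C-callable routine with the signature double scale(double, float).

```rust
use std::mem::size_of;
use std::os::raw::{c_double, c_float};

/// Hypothetical C-ABI function: multiplies a `double` by a `float`.
extern "C" fn scale(x: c_double, k: c_float) -> c_double {
    x * c_double::from(k)
}

fn main() {
    // On targets supported by the standard library, the aliases
    // have the sizes of `f32` and `f64`.
    assert_eq!(size_of::<c_float>(), size_of::<f32>());
    assert_eq!(size_of::<c_double>(), size_of::<f64>());

    assert_eq!(scale(2.0, 1.5), 3.0);
}
```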