The Taxonomy of Pointers

Writing unsafe in Rust usually involves manual management of memory. Although, ideally, we’d like to exclusively use references for this, sometimes the constraints they apply are too strong. This post is a guide on those constraints and how to weaken them for correctness.

“Unmanaged” languages, like C++ and Rust, provide pointer types for manipulating memory. These types serve different purposes and provide different guarantees. These guarantees are useful for the optimizer but get in the way of correctness of low-level code. This is especially true in Rust, where these constraints are very tight.

NB: This post only surveys data pointers. Function pointers are their own beast, but generally are less fussy, since they all have static lifetime1.

Basic C++ Pointers

First, let’s survey C++. We have three pointer types: the traditional C pointer T*, C++ references T&, and rvalue references T&&. These generally have pretty weak guarantees.

Pointers provide virtually no guarantees at all: they can be null, point to uninitialized memory, or point to nothing at all! C++ only requires that they be aligned2. They are little more than an address (until they are dereferenced, of course).

References, on the other hand, are intended to be the “primary” pointer type. A T& cannot be null, is well-aligned, and is intended to only refer to live memory (although it’s not something C++ can really guarantee for lack of a borrow-checker). References are short-lived.

C++ uses non-nullness to its advantage. For example, Clang will absolutely delete code of the form

auto& x = Foo();
if (&x == nullptr) {
  // Treated as unreachable: the whole branch is deleted.
}

Because references cannot be null, and dereferencing the null pointer is always UB, the compiler may make this fairly strong assumption.

Rvalue references, T&&, are not meaningfully different from normal references, beyond their role in overload resolution.

Choosing a C++ (primitive) pointer type is well-studied and not the primary purpose of this blog. Rather, we’re interested in how these map to Rust, which has significantly more complicated pointers.

Basic Rust Pointers

Like C++, Rust has two broad pointer types: *const T and *mut T, the raw pointers, and &T and &mut T, the references.

Rust pointers have even fewer constraints than C++ pointers; they need not even be aligned3! The const/mut specifier is basically irrelevant, but is useful as a programmer book-keeping tool. Rust also does not enforce the dreaded strict-aliasing rule4 on its pointers.

On the other hand, Rust references are among the most constrained objects in any language that I know of. A shared reference &'a T, lasting for the lifetime 'a, satisfies:

  • Non-null, and well-aligned (like in C++).
  • Points to a valid, initialized T for the duration of 'a.
  • T is never ever mutated for the duration of the reference: the compiler may fold separate reads into one at will. Stronger still, no &mut T is reachable from any thread while the reference is reachable.

Stronger still are &'a mut T references, sometimes called unique references, because in addition to being well-aligned and pointing to a valid T at all times, no other reachable reference ever aliases it in any thread; this is equivalent to a C T* restrict pointer.

Unlike C++, which has two almost-identical pointer types, Rust’s two pointer types provide either no guarantees or all of them. The following unsafe operations are all UB:

let null = unsafe { &*ptr::null() };

// A reference to u8 need not be sufficiently aligned
// for a reference to u32.
let unaligned = unsafe { &*(&0u8 as *const u8 as *const u32) };

// More on this type later...
let uninit = unsafe { &*MaybeUninit::uninit().as_ptr() };

let x = 0;
unsafe {
  // Not UB in C++ with const_cast!
  let p = &x;
  (p as *const i32 as *mut i32).write(42);
}

// Two mutable references live at the same time pointing to
// the same memory. This would also be fine in C++!
let mut y = 0;
let p1 = unsafe { &mut *(&mut y as *mut i32) };
let p2 = unsafe { &mut *(&mut y as *mut i32) };

Wide Pointers

Rust also provides the slice types &[T]5 (of which you get mutable/immutable reference and pointer varieties) and dynamic trait object types &dyn Tr (again, all four basic pointer types are available).

&[T] is a usize6 length plus a pointer to that many Ts. The pointer type of the slice specifies the guarantees on the pointed-to buffer. *mut [T], for example, has no meaningful guarantees, but still contains the length7. Note that the length is part of the pointer value, not the pointee.
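One way to see that the length lives in the pointer value is to compare sizes. The sketch below assumes the two-word layout that current rustc uses for wide pointers; the function names wide_pointer_sizes and raw_slice_len are made up for this example.

```rust
use std::fmt::Debug;
use std::mem::size_of;
use std::ptr;

/// Returns the sizes of a thin reference, a slice reference, and a trait
/// object reference, in that order.
fn wide_pointer_sizes() -> (usize, usize, usize) {
  (
    size_of::<&u8>(),
    size_of::<&[u8]>(),
    size_of::<&dyn Debug>(),
  )
}

/// Builds a raw slice pointer from a data pointer and a length, then reads
/// the length back through a reference.
fn raw_slice_len(data: &[u32]) -> usize {
  // The length is stored next to the address, not inside the buffer.
  let raw: *const [u32] = ptr::slice_from_raw_parts(data.as_ptr(), data.len());
  // SAFETY: `raw` was built from a live, initialized slice.
  unsafe { (&*raw).len() }
}
```

On the targets I am aware of, both wide references come out at twice the size of a thin one.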

&dyn Tr is a trait object. For our purposes, it consists of a pointer to some data plus a pointer to a static vtable. *mut dyn Tr is technically a valid type8. Overall, trait objects aren’t really relevant to this post; they are rarely used this way in unsafe settings.

Weakening the Guarantees

Suppose we’re building some kind of data structure; in Rust, data structures will need some sprinkling of unsafe, since they will need to shovel around memory directly. Typically this is done using raw pointers, but it is preferable to use the least weakened pointer type to allow the compiler to perform whatever optimizations it can.

There are a number of orthogonal guarantees on &T and &mut T we might want to relax:

  • Non-nullness.
  • Well-aligned-ness.
  • Validity and initialized-ness of the pointee.
  • Allocated-ness of the pointee (implied by initialized-ness).
  • Global uniqueness of an &mut T.

Pointer to ZST

The last three of these properties are irrelevant for a zero-sized type. For example, we can generate infinite &mut () with no consequences:

fn unique_unit() -> &'static mut () {
  unsafe { &mut *(0x1 as *mut ()) }
}

We materialize a non-null, well-aligned pointer and reborrow it into a static reference; because there is no data to point to, none of the usual worries about the pointee itself apply. However, the pointer itself must still be non-null and well-aligned; 0x1 is not a valid address for an &[u32; 0], but 0x4 is9.
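Rather than hard-coding an address like 0x4, the standard library's NonNull::dangling() can pick a suitably aligned one. A minimal sketch, where dangling_empty is a name invented for illustration:

```rust
use std::ptr::NonNull;

/// Conjures a reference to a zero-sized array out of thin air.
fn dangling_empty() -> &'static [u32; 0] {
  // `NonNull::dangling()` yields a non-null pointer that is aligned for
  // the pointee type, which is all a reference to a ZST requires.
  let ptr = NonNull::<[u32; 0]>::dangling();
  // SAFETY: the pointee is zero-sized, so no memory is ever read.
  unsafe { &*ptr.as_ptr() }
}
```

The returned reference has length zero and an address that is a multiple of the alignment of u32.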

This also applies to empty slices; in fact, the compiler will happily promote the expression &mut [] to an arbitrary lifetime:

fn unique_empty<T>() -> &'static mut [T] {
  &mut []
}

Null References

The most well-known manner of weakening is Option<&T>. Rust guarantees that this is ABI-compatible with a C pointer const T*, with Option::<&T>::None being a null pointer on the C side. This “null pointer optimization” applies to any type recursively containing at least one &T.

extern "C" {
  fn DoSomething(ptr: Option<&mut u32>);
}

fn do_something() {
  // Calling a foreign function is always unsafe.
  unsafe { DoSomething(None) };  // C will see a `NULL` as the argument.
}

The same effect can be achieved for a pointer type using the NonNull<T> standard library type: Option<NonNull<T>> is identical to *mut T. This is most beneficial for types which would otherwise contain a raw pointer:

struct Vec<T> {
  ptr: NonNull<T>,
  len: usize,
  cap: usize,
}

assert_eq!(size_of::<Vec<u8>>(), size_of::<Option<Vec<u8>>>());
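The size claims above are easy to check directly. The following is a small sketch using only standard library types; option_is_free and to_option are hypothetical helper names.

```rust
use std::mem::size_of;
use std::ptr::NonNull;

/// Returns `true` if wrapping each pointer-like type in `Option` costs no
/// extra space, i.e. `None` is represented by the null pointer.
fn option_is_free() -> bool {
  size_of::<Option<&u32>>() == size_of::<*const u32>()
    && size_of::<Option<&mut u32>>() == size_of::<*mut u32>()
    && size_of::<Option<NonNull<u32>>>() == size_of::<*mut u32>()
    && size_of::<Option<Box<u32>>>() == size_of::<*mut u32>()
}

/// Converts a possibly-null raw pointer into an `Option<NonNull<T>>`.
fn to_option<T>(ptr: *mut T) -> Option<NonNull<T>> {
  NonNull::new(ptr)
}
```

NonNull::new is the safe way to go from a raw pointer to the optional form: a null input becomes None, anything else becomes Some.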

Uninitialized Pointee

No matter what, a &T cannot point to uninitialized memory, since the compiler is free to assume it may read such references at any time with no consequences.

The following classic C pattern is verboten:

Foo foo;
InitFoo(&foo);  // Hypothetical initializer taking a pointer to uninitialized memory.

Rust doesn’t provide any particularly easy ways to allocate memory without initializing it, either, so this usually isn’t a problem. The MaybeUninit<T> type can be used for safely allocating memory without initializing it, via MaybeUninit::uninit().

This type acts as a sort of “optimization barrier” that prevents the compiler from assuming the pointee is initialized. &MaybeUninit<T> is a pointer to potentially uninitialized but definitely allocated memory. It has the same layout as &T, and Rust provides functions like assume_init_ref() for asserting that a &MaybeUninit<T> is definitely initialized. This assertion is similar in consequence to dereferencing a raw pointer.

&MaybeUninit<T> and &mut MaybeUninit<T> should almost be viewed as pointer types in their own right, since they can be converted to/from &T and &mut T under certain circumstances.
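As a rough Rust counterpart to the C out-parameter pattern, here is a minimal sketch built on MaybeUninit. The names init_answer and make_answer are invented for this example.

```rust
use std::mem::MaybeUninit;

/// Fills an uninitialized slot through an out-parameter, C style.
fn init_answer(out: &mut MaybeUninit<u32>) {
  out.write(42);
}

fn make_answer() -> u32 {
  // Allocated on the stack, but not yet initialized.
  let mut slot = MaybeUninit::<u32>::uninit();
  init_answer(&mut slot);
  // SAFETY: `init_answer` wrote a valid `u32` into `slot`.
  unsafe { slot.assume_init() }
}
```

The only unsafe step is the final assertion that the slot has been initialized; everything before it is safe code.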

Because T is almost a “subtype” of MaybeUninit<T>, we are entitled10 to “forget” that the referent of a &T is initialized by converting it to a &MaybeUninit<T>. This makes sense because &T is covariant11 in T. However, this is not true of &mut T, since it’s not covariant:

let mut x = 0;
let uninit: &mut MaybeUninit<i32> = unsafe { transmute(&mut x) };
*uninit = MaybeUninit::uninit();  // Oops, `x` is now uninit!

These types are useful for talking to C++ without giving up too many guarantees. Option<&MaybeUninit<T>> is an almost perfect model of a const T*, under the assumption that most pointers in C++ are valid most of the time.

MaybeUninit<T> also finds use in working with raw blocks of memory, such as in a Vec-style growable slice:

struct SliceVec<'a, T> {
  // Backing memory. The first `len` elements of it are
  // known to be initialized, but no more than that.
  data: &'a mut [MaybeUninit<T>],
  len: usize,
}

impl<'a, T> SliceVec<'a, T> {
  fn push(&mut self, x: T) {
    assert!(self.len < self.data.len());
    self.data[self.len] = MaybeUninit::new(x);
    self.len += 1;
  }
}
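To show how the initialized prefix can be read back, here is a self-contained sketch of the same idea. The new and as_slice methods and the demo_slice_vec function are my additions for illustration, and the sketch deliberately ignores dropping the stored elements.

```rust
use std::mem::MaybeUninit;

struct SliceVec<'a, T> {
  // The first `len` elements are initialized; the rest may not be.
  data: &'a mut [MaybeUninit<T>],
  len: usize,
}

impl<'a, T> SliceVec<'a, T> {
  fn new(data: &'a mut [MaybeUninit<T>]) -> Self {
    SliceVec { data, len: 0 }
  }

  fn push(&mut self, x: T) {
    assert!(self.len < self.data.len(), "out of capacity");
    self.data[self.len] = MaybeUninit::new(x);
    self.len += 1;
  }

  fn as_slice(&self) -> &[T] {
    let init = &self.data[..self.len];
    // SAFETY: the first `len` elements are initialized, and
    // `MaybeUninit<T>` has the same layout as `T`.
    unsafe { &*(init as *const [MaybeUninit<T>] as *const [T]) }
  }
}

fn demo_slice_vec() -> Vec<u32> {
  // Stack-allocated backing storage, never initialized up front.
  let mut backing = [MaybeUninit::<u32>::uninit(); 4];
  let mut v = SliceVec::new(&mut backing);
  v.push(1);
  v.push(2);
  v.as_slice().to_vec()
}
```

A fuller version would also implement Drop to run the destructors of the first len elements.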

Aliased Pointee

&mut T can never alias any other pointer, but is also the mechanism by which we perform mutation. It can’t even alias with pointers that Rust can’t see; Rust assumes no one else can touch this memory. Thus, &mut T is not an appropriate analogue for T&.

Like with uninitialized memory, Rust provides a “barrier” wrapper type, UnsafeCell<T>. UnsafeCell<T> is the “interior mutability” primitive, which permits us to mutate through an &UnsafeCell<T> so long as concurrent reads and writes do not occur. We may even convert it to a &mut T when we’re sure we’re holding the only reference.

UnsafeCell<T> forms the basis of the Cell<T>, RefCell<T>, and Mutex<T> types, each of which performs a sort of “dynamic borrow-checking”:

  • Cell<T> only permits direct loads and stores.
  • RefCell<T> maintains a counter of references into it, which it uses to dynamically determine if a mutable reference would be unique.
  • Mutex<T> is like RefCell<T>, but uses concurrency primitives to maintain uniqueness.
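The three flavors of dynamic borrow-checking can be observed in a few lines. This is only a sketch, with dynamic_borrow_checks being a name made up for the example:

```rust
use std::cell::{Cell, RefCell};
use std::sync::Mutex;

fn dynamic_borrow_checks() -> (u32, bool, bool) {
  // `Cell`: plain loads and stores through a shared reference.
  let cell = Cell::new(1u32);
  let shared = &cell;
  shared.set(shared.get() + 1);

  // `RefCell`: a second mutable borrow is refused at runtime.
  let refcell = RefCell::new(0u32);
  let first = refcell.borrow_mut();
  let second_refused = refcell.try_borrow_mut().is_err();
  drop(first);

  // `Mutex`: the same idea, but the bookkeeping is a lock.
  let mutex = Mutex::new(0u32);
  let guard = mutex.lock().unwrap();
  let lock_refused = mutex.try_lock().is_err();
  drop(guard);

  (cell.get(), second_refused, lock_refused)
}
```

In each case the second attempt at unique access fails at runtime instead of at compile time.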

Because of this, Rust must treat &UnsafeCell<T> as always aliasing, but because we can mutate through it, it is a much closer analogue to a C++ T&. However, because &T assumes the pointee is never mutated, it cannot coexist with a &UnsafeCell<T> to the same memory, if mutation is performed through it. The following is explicitly UB:

let mut x = 0;
let p = &x;

// This is ok; creating the reference to UnsafeCell does not
// immediately trigger UB.
let q = unsafe { transmute::<_, &UnsafeCell<i32>>(&x) };

// But writing to it does!
unsafe { *q.get() = 42; }

The Cell<T> type is useful for non-aliasing references to plain-old-data types, which tend to be Copy. It allows us to perform mutation without having to utter unsafe. For example, the correct type for a shared mutable buffer in Rust is &[Cell<u8>], which can be freely memcpy’d, without worrying about aliasing12.

This is most useful for sharing memory with another language, like C++, which cannot respect Rust’s aliasing rules.
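A &[Cell<u8>] can be obtained from ordinary unique access with no unsafe at all, using the standard Cell::from_mut and as_slice_of_cells functions. A minimal sketch, with increment_all and demo_shared_buffer as illustrative names:

```rust
use std::cell::Cell;

/// Adds one to every byte, through a shared reference to the buffer.
fn increment_all(buf: &[Cell<u8>]) {
  for byte in buf {
    byte.set(byte.get().wrapping_add(1));
  }
}

fn demo_shared_buffer() -> [u8; 4] {
  let mut bytes = [1u8, 2, 3, 255];
  {
    // Turn unique access into shared, mutable access.
    let cells: &[Cell<u8>] = Cell::from_mut(&mut bytes[..]).as_slice_of_cells();
    // A second, aliasing view of the same bytes.
    let alias = cells;
    increment_all(cells);
    increment_all(alias);
  }
  bytes
}
```

Both views mutate the same four bytes, and the borrow checker is satisfied because neither is a &mut.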

Combined Barriers

To recap:

  • Non-nullness can be disabled with Option<&T>.
  • Initialized-ness can be disabled with &MaybeUninit<T>.
  • Uniqueness can be disabled with &UnsafeCell<T>.

There is no way to disable alignment and validity restrictions: references must always be aligned and have a valid lifetime attached. If these are unachievable, raw pointers are your only option.

We can combine these various “weakenings” to produce aligned, lifetime-bound references to data with different properties. For example:

  • &UnsafeCell<MaybeUninit<T>> is as close as we can get to a C++ T&.
  • Option<&UnsafeCell<T>> is like a raw pointer, but to initialized memory.
  • Option<&mut MaybeUninit<T>> is like a raw pointer, but with alignment, aliasing, and lifetime requirements.
  • UnsafeCell<&[T]> permits us to mutate the pointer to the buffer and its length, but not the values it points to themselves.
  • UnsafeCell<&[UnsafeCell<T>]> lets us mutate both the buffer and its actual pointer/length.

Interestingly, there is no equivalent to a C++ raw pointer: there is no way to create a guaranteed-aligned pointer without a designated lifetime13.

Other Pointers

Rust and C++ have many other pointer types, such as smart pointers. However, in both languages, these are built in terms of the basic pointer types above. Hopefully this article is a useful reference for anyone writing unsafe abstractions who wishes to avoid using raw pointers when possible. ◼

  1. Except in Go, which synthesizes vtables on the fly. Story for another day. 

  2. It is, apparently, a little-known fact that constructing unaligned pointers, but then never dereferencing them, is still UB in C++. C++ could, for example, store information in the lower bits of such a pointer. The in-memory representation of a pointer is actually unspecified! 

  3. This is useful when paired with the Rust <*const T>::read_unaligned() function, which can be compiled down to a normal load on architectures that do not have alignment restrictions, like x86_64 and aarch64. 

  4. Another story for another time. 

  5. Comparable to the C++20 std::span<T> type. 

  6. usize is Rust’s machine word type, compare std::uintptr_t

  7. The length of a *mut [T] can be accessed via the unstable <*mut [T]>::len() method. 

  8. It is also not a type I have encountered enough to have much knowledge on. For example, I don’t actually know if the vtable half of a *mut dyn Tr must always be valid or not; I suspect the answer is “no”, but I couldn’t find a citation for this. 

  9. Note that you cannot continue to use a reference to freed, zero-sized memory. This subtle distinction is called out in

  10. Currently, a transmute must be used to perform this operation, but I see no way this would permit us to perform an illegal mutation without uttering unsafe a second time. In particular, MaybeUninit::assume_init_read(), which could be used to perform illegal copies, is an unsafe function.

  11. A covariant type Cov<T> is one where, if T is a subtype of U, then Cov<T> is a subtype of Cov<U>. This isn’t particularly noticeable in Rust, where the only subtyping relationships are &'a T subtypes &'b T when 'a outlives 'b, but is nonetheless important for advanced type design.

  12. Cell<T> does not provide synchronization; you still need locks to share it between threads. 

  13. I have previously proposed a sort of 'unsafe or '! “lifetime” that is intended to be the lifetime of dangling references (a bit of an oxymoron). This would allow us to express this concept, but I need to flesh out the concept more. 

Move Constructors in Rust:
Is it possible?

I’ve been told I need to write this idea down – I figure this one’s a good enough excuse to start one of them programming blogs.

TL;DR You can move-constructors the Rust! It requires a few macros but isn’t much more outlandish than the async pinning state of the art. A prototype of this idea is implemented in my moveit crate.

The Interop Problem

Rust is the best contender for a C++ replacement; this is not even a question at this point1. It’s a high-level language that provides users with appropriate controls over memory, while also being memory safe. Rust accomplishes this by codifying C++ norms and customs around ownership and lifetimes into its type system.

Rust has an ok2 FFI story for C:

// C side.
void into_rust();

void into_c() {
  into_rust();
}

// Rust side.
extern "C" {
  fn into_c();
}

#[no_mangle]
extern "C" fn into_rust() {
  unsafe { into_c() }
}

Calling into either of these functions from the Rust or C side will recurse infinitely across the FFI boundary. The extern "C" {} item on the Rust side declares C symbols, much like a function prototype in C would; the extern "C" fn is a Rust function with the C calling convention, and the #[no_mangle] annotation ensures that into_rust is the name that the linker sees for this function. The link works out, we run our program, and the stack overflows. All is well.

But this is C. We want to rewrite all of the world’s C++ in Rust, but unfortunately that’s going to take about a decade, so in the meantime new Rust must be able to call existing C++, and vice versa. C++ has a much crazier ABI, and while Rust gives us the minimum of passing control to C, libraries like cxx need to provide a bridge on top of this for Rust and C++ to talk to each other.

Unfortunately, the C++ and Rust object models are, a priori, incompatible. In Rust, every object may be “moved” via memcpy, whereas in C++ this only holds for types satisfying std::is_trivially_moveable3. Some types require calling a move constructor, or may not be moveable at all!

Even more alarming, C++ types are permitted to take the address of the location where they are being constructed: the this pointer is always accessible, allowing easy creation of self-referential types:

class Cyclic {
 public:
  Cyclic() {}
  // Ensure that copy and move construction respect the self-pointer
  // invariant:
  Cyclic(const Cyclic&) {
    new (this) Cyclic;
  }
  // snip: Analogous for other rule-of-five constructors.

 private:
  Cyclic* ptr_ = this;
};

The solution cxx and other FFI strategies take is to box up complex C++ objects across the FFI boundary; a std::unique_ptr<Cyclic> (perhaps reinterpreted as a Box on the Rust side) can be passed around without needing to call move constructors. The heap allocation is a performance regression that scares off potential Rust users, so it’s not a viable solution.

We can do better.

Notation and Terminology

“Move” is a very, very overloaded concept across Rust and C++, and many people have different names for what this means. So that we don’t get confused, we’ll establish some terminology to use throughout the rest of the article.

A destructive move is a Rust-style move, which has the following properties:

  • It does not create a new object; from the programmer’s perspective, the object has simply changed address.
  • The move is implemented by a call to memcpy; no user code is run.
  • The moved-from value becomes inaccessible and its destructor does not run.

A destructive move, in effect, is completely invisible to the user4, and the Rust compiler can emit as many or as few of them as it likes. We will refer to this as a “destructive move”, a “Rust move”, or a “blind, memcpy move”.

A copying move is a C++-style move, which has the following properties:

  • It creates a new, distinct object at a new memory location.
  • The move is implemented by calling a user-provided function that initializes the new object.
  • The moved-from value is still accessible but in an “unspecified but valid state”. Its destructor is run once the current scope ends.

A copying move is just a weird copy operation that mutates the copied-from object. C++ compilers may elide calls to the move constructor in certain situations, but calling it usually requires the programmer to explicitly ask for it. From a Rust perspective, this is as if Clone::clone() took &mut self as an argument. We will refer to this as a “copying move”, a “nondestructive move”, a “C++ move”, or, metonymically, as a “move constructor”.
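The two kinds of move can be contrasted in plain Rust. In this sketch, std::mem::take stands in for a C++ move constructor, which is only an approximation: it leaves the source in a specified (default) state instead of an unspecified one. Both function names are invented for the example.

```rust
/// A Rust (destructive) move: `s` is consumed and no user code runs.
fn destructive_move(s: String) -> String {
  let moved = s;
  // `s` is no longer accessible here; using it is a compile error.
  moved
}

/// An approximation of a C++ (copying) move: the source stays accessible
/// and is left in a valid, empty state.
fn copying_move(src: &mut String) -> String {
  std::mem::take(src)
}
```

After copying_move, the caller still owns src and its destructor still runs at the end of the scope, just as with a moved-from C++ object.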

Pinned Pointers

As part of introducing support for stackless coroutines5 (aka async/await), Rust had to provide some kind of support for immobile types through pinned pointers.

The Pin type is a wrapper around a pointer type, such as Pin<&mut i32> or Pin<Box<ComplexObject>>. Pin provides the following guarantee to unsafe code:

Given p: Pin<P> for P: Deref, and P::Target: !Unpin, the pointee object *p will always be found at that address, and no other object will use that address until *p’s destructor is called.

In a way, Pin<P> is a witness to a sort of critical section: once constructed, that memory is pinned until the destructor runs. The Pin documentation goes into deep detail about when and why this matters, and how unsafe code can take advantage of this guarantee to provide a safe interface.

The key benefit is that unsafe code can create self-references behind the pinned pointer, without worrying about them breaking when a destructive move occurs. C++ deals with this kind of type by allowing move/copy constructors to observe the new object’s address and fix up any self references as necessary.
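For a concrete picture of a self-reference behind a pinned pointer, here is a minimal sketch modeled on the pattern shown in the standard library's pin documentation. SelfRef and its methods are illustrative names, not part of any library.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

struct SelfRef {
  value: u32,
  // Points at `value` once the struct has been pinned.
  ptr: *const u32,
  // Opts out of `Unpin`, so that `Pin` actually restricts moves.
  _pin: PhantomPinned,
}

impl SelfRef {
  fn new(value: u32) -> Pin<Box<SelfRef>> {
    let mut boxed = Box::pin(SelfRef {
      value,
      ptr: std::ptr::null(),
      _pin: PhantomPinned,
    });
    let ptr: *const u32 = &boxed.value;
    // SAFETY: we only write a field; the value is not moved.
    unsafe { boxed.as_mut().get_unchecked_mut().ptr = ptr };
    boxed
  }

  fn points_to_self(self: Pin<&Self>) -> bool {
    std::ptr::eq(self.ptr, &self.value)
  }

  fn read_via_ptr(self: Pin<&Self>) -> u32 {
    // SAFETY: `ptr` was set in `new`, and the pointee is pinned.
    unsafe { *self.ptr }
  }
}
```

Because the box is pinned, safe code cannot move the SelfRef out from under its own pointer.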

Our progress so far: C++ types can be immoveable from Rust’s perspective. They need to be pinned in some memory location: either on the heap as a Pin<Box<T>>, or on the stack (somehow; keep reading). Our problem is now to reconcile C++ move constructors with this standard library object that explicitly prevents moves. Easy, right?


C++ constructors are a peculiar thing. Unlike Rust’s Foo::new()-style factories, or even constructors in dynamic languages like Java, C++ constructors are unique in that they construct a value in a specific location. This concept is best illustrated by the placement-new operation:

void MakeString(std::string* out) {
  new (out) std::string("mwahahaha");
}

Placement-new is one of those exotic C++ operations you only ever run into deep inside fancy library code. Unlike new, which triggers a trip into the allocator, placement-new simply calls the constructor of your type with this set to the argument in parentheses. This is the “most raw” way you can call a constructor: given a memory location and arguments, construct a new value.

In Rust, a method call foo.bar() is really syntax sugar for Foo::bar(foo). This is not the case in C++; a member function has an altogether different type, but some simple template metaprogramming can flatten it back into a regular old free function:

class Foo {
 public:
  int Bar(int x);
};

inline int FreeBar(Foo& foo, int x) {
  return foo.Bar(x);
}

Foo foo;
FreeBar(foo, 5);

Placement-new lets us do the analogous thing for a constructor:

class Foo {
 public:
  Foo(int x);
};

inline void FreeFoo(Foo& foo, int x) {
  new (&foo) Foo(x);
}

Foo* foo = AllocateSomehow();
FreeFoo(*foo, 5);

We can lift this “flattening” of a specific constructor into Rust, using the existing vocabulary for pushing fixed-address memory around:

unsafe trait Ctor {
  type Output;
  unsafe fn ctor(self, dest: Pin<&mut MaybeUninit<Self::Output>>);
}

A Ctor is a constructing closure. A Ctor contains the necessary information for constructing a value of type Output which will live at the location *dest. The Ctor::ctor() function performs in-place construction, making *dest become initialized.

A Ctor is not the constructor itself; rather, it is more like a Future or an Iterator which contains the necessary captured values to perform the operation. A Rust type that is constructed using a Ctor would have functions like this:

impl MyType {
  // Body elided.
  fn new() -> impl Ctor<Output = Self>;
}

The unsafe markers serve distinct purposes:

  • It is an unsafe trait, because *dest must be initialized when ctor() returns.
  • It has an unsafe fn, because, in order to respect the Pin drop guarantees, *dest must either be freshly allocated or have had its destructor run just prior.

Since we are constructing into Pinned memory, the Ctor implementation can use the address of *dest as part of the construction procedure and assume that that pointer will not suddenly dangle because of a move. This recovers our C++ behavior of “this-stability”.
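To make “this-stability” concrete, here is a pure-Rust sketch of a Ctor built from a closure, with no C++ involved. The FromPlacementFn adapter, this particular from_placement_fn, and the Tracked type are all written for illustration; they are assumptions and not necessarily how the moveit crate spells things.

```rust
use std::marker::PhantomData;
use std::mem::MaybeUninit;
use std::pin::Pin;

unsafe trait Ctor {
  type Output;
  unsafe fn ctor(self, dest: Pin<&mut MaybeUninit<Self::Output>>);
}

/// Hypothetical adapter: wraps a closure that initializes `*dest`.
struct FromPlacementFn<T, F>(F, PhantomData<fn() -> T>);

// SAFETY: callers of `from_placement_fn` promise that `F` initializes `*dest`.
unsafe impl<T, F> Ctor for FromPlacementFn<T, F>
where
  F: FnOnce(Pin<&mut MaybeUninit<T>>),
{
  type Output = T;
  unsafe fn ctor(self, dest: Pin<&mut MaybeUninit<T>>) {
    (self.0)(dest)
  }
}

/// # Safety
/// `f` must leave `*dest` initialized when it returns.
unsafe fn from_placement_fn<T, F>(f: F) -> impl Ctor<Output = T>
where
  F: FnOnce(Pin<&mut MaybeUninit<T>>),
{
  FromPlacementFn(f, PhantomData)
}

/// A type that records its own address at construction time.
struct Tracked {
  addr_at_construction: usize,
}

impl Tracked {
  fn new() -> impl Ctor<Output = Self> {
    let init = |dest: Pin<&mut MaybeUninit<Tracked>>| {
      // SAFETY: we initialize in place and never move out of `dest`.
      let slot = unsafe { dest.get_unchecked_mut() };
      let addr = slot.as_mut_ptr() as usize;
      slot.write(Tracked { addr_at_construction: addr });
    };
    // SAFETY: `init` always writes a `Tracked` into `*dest`.
    unsafe { from_placement_fn::<Tracked, _>(init) }
  }
}

/// Constructs a `Tracked` in place and checks that it saw its final address.
fn construct_in_place() -> bool {
  let mut storage = MaybeUninit::<Tracked>::uninit();
  let addr = storage.as_mut_ptr() as usize;
  // SAFETY: `storage` is fresh and is not moved before it is read.
  unsafe {
    Tracked::new().ctor(Pin::new_unchecked(&mut storage));
    storage.assume_init_ref().addr_at_construction == addr
  }
}
```

The address recorded inside the constructor matches the address of the storage, which is exactly what a Foo::new()-style factory returning by value cannot promise.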

Unfortunately, Ctor is covered in unsafe, and doesn’t even allocate storage for us. Luckily, it’s not too hard to build our own safe std::make_unique:

fn make_box<C: Ctor>(c: C) -> Pin<Box<C::Output>> {
  unsafe {
    // First, obtain uninitialized memory on the heap.
    let uninit = std::alloc::alloc(Layout::new::<C::Output>());
    // Then, pin this memory as a MaybeUninit. This memory
    // isn't going anywhere, and MaybeUninit's entire purpose
    // in life is being magicked into existence like this,
    // so this is safe.
    let pinned = Pin::new_unchecked(
      &mut *uninit.cast::<MaybeUninit<C::Output>>()
    );
    // Now, perform placement-`new`, potentially FFI'ing into
    // C++.
    c.ctor(pinned);

    // Because Ctor guarantees it, `uninit` now points to a
    // valid `T`. We can safely stick this in a `Box`. However,
    // the `Box` must be pinned, since we pinned `uninit`
    // earlier.
    Pin::new_unchecked(Box::from_raw(uninit.cast::<C::Output>()))
  }
}

Thus, std::make_unique<MyType>() in C++ becomes make_box(MyType::new()) in Rust. Ctor::ctor gives us a bridging point to call the C++ constructor from Rust, in a context where its expectations are respected. For example, we might write the following binding code:

class Foo {
 public:
  Foo(int x);
};

// Give the constructor an explicit C ABI, using
// placement-`new` to perform the "rawest" construction
// possible.
extern "C" void FooCtor(Foo* thiz, int x) {
  new (thiz) Foo(x);
}

struct Foo { ... }
impl Foo {
  fn new(x: i32) -> impl Ctor<Output = Self> {
    unsafe {
      // Declare the placement-new bridge.
      extern "C" {
        fn FooCtor(this: *mut Foo, x: i32);
      }

      // Make a new `Ctor` wrapping a "real" closure.
      ctor::from_placement_fn(move |dest| {
        // Call back into C++.
        FooCtor(dest.as_mut_ptr(), x)
      })
    }
  }
}

use foo_bindings::Foo;

// Lo, behold! A C++ type on the Rust heap!
let foo = make_box(Foo::new(5));

But… we’re still on the heap, so we seem to have made no progress. We could have just called std::make_unique on the C++ side and shunted it over to Rust. In particular, this is what cxx resorts to for complex types.

Interlude I: Pinning on the Stack

Creating pinned pointers directly requires a sprinkling of unsafe. Box::pin() allows us to safely create a Pin<Box<T>>, since we know it will never move, much like the make_box() example above. However, it’s not possible to create a Pin<&mut T> to not-necessarily-Unpin data as easily:

let mut data = 42;
let ptr = &mut data;
let pinned = unsafe {
  // Reborrow `ptr` to create a pointer with a shorter lifetime.
  Pin::new_unchecked(&mut *ptr)
};

// Once `pinned` goes out of scope, we can move out of `*ptr`!
let moved = *ptr;

The unsafe block is necessary because of exactly this situation: &mut T does not own its pointee, and a given mutable reference might not be the “oldest” mutable reference there is. The following is a safe usage of this constructor:

let mut data = 42;
// Intentionally shadow `data` so that no longer-lived reference than
// the pinned one can be created.
let data = unsafe {
  Pin::new_unchecked(&mut data)
};

This is such a common pattern in futures code that many futures libraries provide a macro for performing this kind of pinning on behalf of the user, such as tokio::pin!().
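Since Rust 1.68 the standard library also ships such a macro, std::pin::pin!. Here is a minimal sketch of stack pinning with it; Immovable, bump, and pin_on_stack are names made up for this example.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

struct Immovable {
  value: u32,
  // Opts out of `Unpin`, so the pin is meaningful.
  _pin: PhantomPinned,
}

fn bump(this: Pin<&mut Immovable>) -> u32 {
  // SAFETY: we only mutate a field in place; nothing is moved.
  let inner = unsafe { this.get_unchecked_mut() };
  inner.value += 1;
  inner.value
}

fn pin_on_stack() -> u32 {
  // `pin!` moves the value into a hidden local, so no longer-lived
  // `&mut Immovable` can ever be created.
  let mut pinned: Pin<&mut Immovable> = pin!(Immovable {
    value: 41,
    _pin: PhantomPinned,
  });
  bump(pinned.as_mut())
}
```

The macro performs the same shadowing trick as the hand-written version, but without any unsafe at the call site.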

With this in hand, we can actually call a constructor on a stack-pinned value:

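let val = MaybeUninit::uninit();
// Pin `val` to the stack, rebinding it as a `Pin<&mut MaybeUninit<Foo>>`.
tokio::pin!(val);
unsafe { Foo::new(args).ctor(val.as_mut()); }
let val = unsafe {
  val.map_unchecked_mut(|x| &mut *x.as_mut_ptr())
};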

Unfortunately, we still need to utter a little bit more unsafe, but because of Ctor’s guarantees, this is all perfectly safe; the compiler just can’t guarantee it on its own. The natural thing to do is to wrap it up in a macro much like pin!, which we’ll call emplace!:

emplace!(let val = Foo::new(args));

This is truly analogous to C++ stack initialization, such as Foo val(args);, although the type of val is Pin<&mut Foo>, whereas in C++ it would merely be Foo&. This isn’t much of an obstacle, and just means that Foo’s API on the Rust side needs to use Pin<&mut Self> for its methods.

The Return Value Optimization

Now we go to build our Foo-returning function and are immediately hit with a roadblock:

fn make_foo() -> Pin<&'wat mut Foo> {
  emplace!(let val = Foo::new(args));
  val
}

What is the lifetime 'wat? This is just returning a pointer to the current stack frame, which is no good. In C++ (ignoring fussy details about move semantics), NRVO would kick in and val would be constructed “in the return slot”:

Foo MakeFoo() {
  Foo val(args);
  return val;
}

Return value optimization (and the related named return value optimization) allow C++ to elide copies when constructing return values. Instead of constructing val on MakeFoo’s stack and then copying it into the ABI’s return location (be that a register like rax or somewhere in the caller’s stack frame), the value is constructed directly in that location, skipping the copy. Rust itself performs some limited RVO, though its style of move semantics makes this a bit less visible.

Rust does not give us a good way of accessing the return slot directly, for good reason: it need not have an address! Rust returns all types that look roughly like a single integer in a register (on modern ABIs), and registers don’t have addresses. C++ ABIs typically solve this by making types which are “sufficiently complicated” (usually when they are not trivially moveable) get passed on the stack unconditionally6.

Since we can’t get at the return slot, we’ll make our own! We just need to pass the pinned MaybeUninit<T> memory that we would pass into Ctor::ctor as a “fake return slot”:

fn make_foo(mut return_slot: Pin<&mut MaybeUninit<Foo>>) -> Pin<&mut Foo> {
  unsafe {
    Foo::new(args).ctor(return_slot.as_mut());
    return_slot.map_unchecked_mut(|x| &mut *x.as_mut_ptr())
  }
}

This is such a common operation that it makes sense to replace Pin<&mut MaybeUninit<T>> with a specific type, Slot<'a, T>:

struct Slot<'a, T>(Pin<&'a mut MaybeUninit<T>>);
impl<'a, T> Slot<'a, T> {
  fn emplace<C: Ctor<Output = T>>(mut self, c: C) -> Pin<&'a mut T> {
    unsafe {
      c.ctor(self.0.as_mut());
      self.0.map_unchecked_mut(|x| &mut *x.as_mut_ptr())
    }
  }
}

fn make_foo(return_slot: Slot<Foo>) -> Pin<&mut Foo> {
  return_slot.emplace(Foo::new(args))
}

We can provide another macro, slot!(), which reserves pinned space on the stack much like emplace!() does, but without the construction step. Calling make_foo only requires minimal ceremony and no user-level unsafe.

slot!(foo);
let foo = make_foo(foo);

The slot!() macro is almost identical to tokio::pin!(), except that it doesn’t initialize the stack space with an existing value.

Towards Move Constructors: Copy Constructors

Move constructors involve rvalue references, which Rust has no meaningful equivalent for, so we’ll attack the easier version: copy constructors.

A copy constructor is C++’s Clone equivalent, but, like all constructors, is allowed to inspect the address of *this. Its sole argument is a const T&, which has a direct Rust analogue: a &T. Let’s write up a trait that captures this operation:

unsafe trait CopyCtor: Sized {
  unsafe fn copy_ctor(src: &Self, dest: Pin<&mut MaybeUninit<Self>>);
}

Unlike Ctor, we would implement CopyCtor on the type with the copy constructor, bridging it to C++ as before. We can then define a helper that builds a Ctor for us:

fn copy<T: CopyCtor>(val: &T) -> impl Ctor<Output = T> + '_ {
  unsafe {
    ctor::from_placement_fn(move |dest| {
      T::copy_ctor(val, dest)
    })
  }
}

emplace!(let y = copy(x));     // Calls the copy constructor.
let boxed = make_box(copy(y)); // Copy onto the heap.

We could (modulo orphan rules) even implement CopyCtor for Rust types that implement Clone by cloning into the destination.
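Here is a minimal sketch of that Clone-based implementation, written as a blanket impl inside a single crate where coherence is not a concern. The copy_into_slot helper is an invented name, and the Sized bound on the trait is an assumption needed to make MaybeUninit<Self> well-formed.

```rust
use std::mem::MaybeUninit;
use std::pin::Pin;

unsafe trait CopyCtor: Sized {
  unsafe fn copy_ctor(src: &Self, dest: Pin<&mut MaybeUninit<Self>>);
}

// SAFETY: `copy_ctor` always initializes `*dest` with a clone of `src`.
unsafe impl<T: Clone> CopyCtor for T {
  unsafe fn copy_ctor(src: &Self, dest: Pin<&mut MaybeUninit<Self>>) {
    // SAFETY: writing into the slot does not move anything out of it.
    let slot = unsafe { dest.get_unchecked_mut() };
    slot.write(src.clone());
  }
}

/// Copy-constructs `src` into fresh stack storage and returns the copy.
fn copy_into_slot<T: CopyCtor>(src: &T) -> T {
  let mut slot = MaybeUninit::<T>::uninit();
  // SAFETY: `slot` is fresh, and `copy_ctor` initializes it.
  unsafe {
    T::copy_ctor(src, Pin::new_unchecked(&mut slot));
    slot.assume_init()
  }
}
```

Note that copy_into_slot moves the result out of the slot at the end, which is only acceptable for ordinary Rust types; an address-sensitive C++ type would have to stay behind its Pin.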

It should be straightforward to make a version for move construction… but, what’s a T&& in Rust?

Interlude II: Unique Ownership

Box<T> is interesting, because unlike &T, it is possible to move out of a Box<T>, since the compiler treats it somewhat magically. There has long been a desire to introduce a DerefMove trait that captures this behavior, but the difficulty is the signature: if deref returns &T, and deref_mut returns &mut T, should deref_move return T? Or something… more exotic? You might not want to dump the value onto the stack; you want *x = *y to not trigger an expensive intermediate copy, when *y: [u8; BIG].

Usually, the “something more exotic” is a &move T or &own T reference that “owns” the pointee, similar to how a T&& in C++ is taken to mean that the caller wishes to perform ownership transfer.

Exotic language features aside, we’d like to be able to implement something like DerefMove for move constructors, since this is the natural analogue of T&&. To move out of storage, we need a smart pointer to provide us with three things:

  • It must actually be a smart pointer (duh).
  • It must be possible to destroy the storage without running the destructor of the pointee (in Rust, unlike in C++, destructors do not run on moved-from objects).
  • It must be the unique owner of the pointee. Formally, if, when p goes out of scope, no thread can access *p, then p is the unique owner.

Box<T> trivially satisfies all three of these: it’s a smart pointer, we can destroy the storage using std::alloc::dealloc, and it satisfies the unique ownership property.

&mut T fails the last two tests: we don’t know how to destroy the storage (this is one of the difficulties with a theoretical &move T) and it is not the unique owner: some &mut T might outlive it.

Interestingly, Arc<T> only fails the unique ownership test, and it can pass it dynamically, if we observe the strong and weak counts to both be 1. This is also true for Rc<T>.
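The standard library already exposes this dynamic check: `Arc::get_mut` returns `Some` only when no other `Arc` or `Weak` handle exists. In terms of the public counters, that state is `Arc::strong_count() == 1` and `Arc::weak_count() == 0`, because the public weak count does not include the implicit weak reference the strong handles share. A small sketch (the helper names are made up for illustration):

```rust
use std::sync::Arc;

/// Returns true if `p` is currently the only handle (strong or weak)
/// to its pointee.
fn is_unique<T>(p: &mut Arc<T>) -> bool {
    Arc::get_mut(p).is_some()
}

/// Records the uniqueness check at four points in an `Arc`'s life.
fn arc_uniqueness_demo() -> [bool; 4] {
    let mut a = Arc::new(5);
    let alone = is_unique(&mut a);

    let b = Arc::clone(&a);
    let with_clone = is_unique(&mut a);
    drop(b);

    let w = Arc::downgrade(&a);
    let with_weak = is_unique(&mut a);
    drop(w);

    let alone_again = is_unique(&mut a);
    [alone, with_clone, with_weak, alone_again]
}
```

The same check works for `Rc<T>` via `Rc::get_mut`.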

Most importantly, however, for Pin<P> it is sufficient that P satisfy these conditions. After all, a Pin<Box<T>> uniquely owns its contents, even if they can’t be moved.

It’s useful to introduce some traits that record these requirements:

unsafe trait OuterDrop {
  unsafe fn outer_drop(this: *mut Self);
}

unsafe trait DerefMove: DerefMut + OuterDrop {}

OuterDrop is simply the “outer storage destruction” operation. Naturally, it is only safe to perform this operation when the pointee has already been destroyed or moved out separately (there are some subtleties around leaking memory here, but in general it’s not a good idea to destroy storage without destroying the pointee, too).
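To make the “storage only” part concrete, here is one way OuterDrop might look for Box<T>. This is a sketch, not the crate’s implementation: rather than calling std::alloc::dealloc directly, it reinterprets the box as a Box<MaybeUninit<T>>, whose destructor frees the allocation without touching the pointee. `Noisy` is a hypothetical type that counts destructor runs.

```rust
use std::cell::Cell;
use std::mem::{ManuallyDrop, MaybeUninit};

unsafe trait OuterDrop {
    unsafe fn outer_drop(this: *mut Self);
}

unsafe impl<T> OuterDrop for Box<T> {
    unsafe fn outer_drop(this: *mut Self) {
        unsafe {
            // Take the `Box` back out of the caller's (suppressed) slot.
            let raw: *mut T = Box::into_raw(this.read());
            // Dropping a `Box<MaybeUninit<T>>` frees the allocation but
            // never runs `T`'s destructor.
            drop(Box::from_raw(raw.cast::<MaybeUninit<T>>()));
        }
    }
}

/// Hypothetical pointee that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);
impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the destructor count right after freeing the box, and again
/// after dropping the moved-out pointee.
fn outer_drop_demo() -> (u32, u32) {
    let drops = Cell::new(0);
    let mut boxed = Box::new(Noisy(&drops));
    unsafe {
        // Move the pointee out first, so nothing is leaked.
        let val = (&mut *boxed as *mut Noisy).read();
        // Suppress the box's ordinary destructor, then free its storage.
        let mut boxed = ManuallyDrop::new(boxed);
        OuterDrop::outer_drop(&mut *boxed as *mut Box<Noisy>);
        let after_free = drops.get();
        drop(val);
        (after_free, drops.get())
    }
}
```

Freeing the storage leaves the count at zero; only dropping the moved-out value bumps it to one.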

DerefMove7 is the third requirement, which the compiler cannot check (there’s a lot of these, huh?). Any type which implements DerefMove can be moved out of by carefully dismantling the pointer:

fn move_out_of<P>(mut p: P) -> P::Target
where
  P: DerefMove,
  P::Target: Sized + Unpin,
{
  unsafe {
    // Copy the pointee out of `p` (all Rust moves are
    // trivial copies). We need `Unpin` for this to be safe.
    let val = (&mut *p as *mut P::Target).read();

    // Destroy `p`'s storage without running the pointee's
    // destructor.
    let ptr = &mut p as *mut P;
    // Make sure to suppress the actual "complete" destructor of
    // `p`.
    mem::forget(p);
    // Actually destroy the storage.
    P::outer_drop(ptr);

    // Return the moved pointee, which will be trivially NRVO'ed.
    val
  }
}

Much like pinning, we need to lift this capability to the stack somehow. &mut T won’t cut it here.

Owning the Stack

We can already speak of uninitialized but uniquely-owned stack memory with Slot, but Slot::emplace() returns a (pinned) &mut T, which cannot be DerefMove. This operation actually loses the uniqueness information of Slot, so instead we make emplace() return a StackBox.

A StackBox<'a, T> is like a Box<T> that’s bound to a stack frame, using a Slot<'a, T> as underlying storage. Although it’s just a &mut T on the inside, it augments it with the uniqueness invariant above. In particular, StackBox::drop() is entitled to call the destructor of its pointee in-place.

To the surprise of no one who has read this far, StackBox: DerefMove. The implementation for StackBox::outer_drop() is a no-op, since the calling convention takes care of destroying stack frames.
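A stripped-down sketch of the Slot/StackBox pair may help here. It is simplified on purpose and is not the crate’s code: `emplace` takes a plain value instead of a Ctor, pinning is left out, and `Slot::new` and `Noisy` are hypothetical helpers.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ops::{Deref, DerefMut};
use std::ptr;

/// Uninitialized, uniquely owned storage in some stack frame.
struct Slot<'a, T>(&'a mut MaybeUninit<T>);

/// An initialized slot: a `&mut T` plus the promise of unique ownership.
struct StackBox<'a, T>(&'a mut T);

impl<'a, T> Slot<'a, T> {
    fn new(storage: &'a mut MaybeUninit<T>) -> Self {
        Slot(storage)
    }

    /// Simplified: takes a value rather than a `Ctor`, and skips `Pin`.
    fn emplace(self, val: T) -> StackBox<'a, T> {
        StackBox(self.0.write(val))
    }
}

impl<T> Deref for StackBox<'_, T> {
    type Target = T;
    fn deref(&self) -> &T {
        &*self.0
    }
}

impl<T> DerefMut for StackBox<'_, T> {
    fn deref_mut(&mut self) -> &mut T {
        &mut *self.0
    }
}

impl<T> Drop for StackBox<'_, T> {
    fn drop(&mut self) {
        // Unique ownership entitles us to destroy the pointee in place.
        let p: *mut T = &mut *self.0;
        unsafe { ptr::drop_in_place(p) }
    }
}

/// Hypothetical pointee that counts how often its destructor runs.
struct Noisy<'c>(&'c Cell<u32>);
impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Returns the destructor count before and after the `StackBox` dies.
fn stack_box_demo() -> (u32, u32) {
    let drops = Cell::new(0);
    let mut storage = MaybeUninit::uninit();
    let boxed = Slot::new(&mut storage).emplace(Noisy(&drops));
    let before = drops.get();
    drop(boxed);
    (before, drops.get())
}
```

An OuterDrop impl for this type would have an empty body: the frame that owns `storage` reclaims the bytes on its own.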

It makes sense that, since Slot::emplace() returns a Pin<StackBox<T>>, so should emplace!().

(There’s a crate called stackbox that provides similar StackBox/Slot types, although it is implemented slightly differently and does not provide the pinning guarantees we need.)

Move Constructors

This is it. The moment we’ve all been waiting for. Behold, the definition of a move constructor in Rust:

unsafe trait MoveCtor {
  unsafe fn move_ctor(
    src: &mut Self,
    dest: Pin<&mut MaybeUninit<Self>>
  );
}

Wait, that’s it?

There’s no such thing as &move Self, so, much like drop(), we have to use a plain ol’ &mut instead. Like Drop, and like CopyCtor, this function is not called directly by users; instead, we provide an adaptor that takes in a MoveCtor and spits out a Ctor.

fn mov<P>(mut ptr: P) -> impl Ctor<Output = P::Target>
where
  P: DerefMove,
  P::Target: MoveCtor,
{
  unsafe {
    from_placement_fn(move |dest| {
      MoveCtor::move_ctor(&mut *ptr, dest);

      // Destroy `ptr`'s storage without running the pointee's
      // destructor.
      let inner = &mut ptr as *mut P;
      mem::forget(ptr);
      P::outer_drop(inner);
    })
  }
}

Notice that we no longer require that P::Target: Unpin, since the ptr::read() call from move_out_of() is now gone. Instead, we need to make a specific requirement of MoveCtor that I will explain shortly. However, we can now freely call the move constructor just like any other Ctor:

emplace!(let y = mov(x));  // Calls the move constructor.
let boxed = make_box(mov(y));  // Move onto the heap.

The Language-Lawyering Part

(If you don’t care for language-lawyering, you can skip this part.)

Ok. We need to justify the loss of the P::Target: Unpin bound on mov(), which seems almost like a contradiction: Pin<P> guarantees its pointee won’t be moved, but isn’t the whole point of MoveCtor to perform moves?

At the beginning of this article, I called out the difference between destructive Rust moves and copying C++ moves. The reason that the above isn’t a contradiction is that the occurrences of “move” in that sentence refer to these different senses of “move”.

The specific thing that Pin<P> is protecting unsafe code from is whatever state is behind the pointer being blindly memcpy moved to another location, leaving any self-references in the new location dangling. However, by invoking a C++-style move constructor, the data never “moves” in the Rust sense; it is merely copied in a way that carefully preserves any address-dependent state.
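As an illustration of “carefully preserves any address-dependent state”, here is a sketch of a hand-written move constructor for a tiny self-referential type. `SelfRef` and its `init` helper are hypothetical, and the trait is restated locally (with an added `Sized` bound) so the snippet stands alone:

```rust
use std::marker::PhantomPinned;
use std::mem::MaybeUninit;
use std::pin::Pin;
use std::ptr;

unsafe trait MoveCtor: Sized {
    unsafe fn move_ctor(src: &mut Self, dest: Pin<&mut MaybeUninit<Self>>);
}

/// Hypothetical self-referential type: `me` must point at `value`.
struct SelfRef {
    value: i32,
    me: *const i32,
    _pin: PhantomPinned,
}

impl SelfRef {
    /// Writes a fully linked `SelfRef` into `dest`.
    unsafe fn init(dest: *mut SelfRef, value: i32) {
        unsafe {
            dest.write(SelfRef { value, me: ptr::null(), _pin: PhantomPinned });
            (*dest).me = ptr::addr_of!((*dest).value);
        }
    }
}

unsafe impl MoveCtor for SelfRef {
    unsafe fn move_ctor(src: &mut Self, dest: Pin<&mut MaybeUninit<Self>>) {
        unsafe {
            let dest = dest.get_unchecked_mut().as_mut_ptr();
            // Copy the payload, then recompute the self-pointer for the
            // new address instead of copying the stale one.
            SelfRef::init(dest, src.value);
        }
        // `src` is now moved-from; poison its self-pointer.
        src.me = ptr::null();
    }
}

/// Checks that the self-pointer is still valid after a C++-style move.
fn self_pointer_survives_move() -> bool {
    let mut a = MaybeUninit::<SelfRef>::uninit();
    let mut b = MaybeUninit::<SelfRef>::uninit();
    unsafe {
        SelfRef::init(a.as_mut_ptr(), 7);
        SelfRef::move_ctor(a.assume_init_mut(), Pin::new_unchecked(&mut b));
        let moved = b.assume_init_ref();
        ptr::eq(moved.me, &moved.value) && *moved.me == 7
    }
}
```

A plain memcpy of `a` into `b` would have left `b.me` pointing back into `a`; the move constructor is what re-links it.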

We need to ensure two things:

  • Implementors of MoveCtor for their own type must ensure that their type does not rely on any pinning guarantees that the move constructor cannot appropriately “fix up”.
  • No generic code can hold onto a reference to moved-from state, because it could then witness whatever messed-up post-destruction state the move constructor leaves behind.

The first of these is passed onto the implementor as an unsafe impl requirement. Designing an !Unpin type by hand is difficult, and auto-generated C++ bindings using this model would hopefully inherit move-correctness from the C++ code itself.

The second is more subtle. In the C++ model, the moved-from value is mutated to mark it as “moved from”, which usually just inhibits the destructor. C++ believes all destructors are run for all objects. For example, std::unique_ptr sets the moved-from value to nullptr, so that the destructor can be run at the end of scope and do nothing. Compare with the Rust model, where the compiler inhibits the destructor automatically through the use of drop flags.
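The two models can be contrasted in plain Rust. In this sketch, `rust_style` relies on the compiler’s drop flag after a conditional move, while `cpp_style` imitates the std::unique_ptr approach with a hypothetical `UniquePtrLike` wrapper whose “move” leaves `None` behind. `Noisy` is a made-up type that counts destructor runs.

```rust
use std::cell::Cell;

/// Hypothetical type that counts how often its destructor runs.
struct Noisy<'a>(&'a Cell<u32>);
impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Rust model: a hidden drop flag records whether `x` was moved out of.
fn rust_style(move_it: bool) -> u32 {
    let drops = Cell::new(0);
    {
        let x = Noisy(&drops);
        if move_it {
            let y = x; // `x` is now moved-from; its drop flag is cleared.
            drop(y);
        }
        // End of scope: `x` is dropped only if its flag is still set.
    }
    drops.get()
}

/// C++ model, imitated: the source's destructor always runs, so the
/// "move" must leave it in a harmless state (here, `None`).
struct UniquePtrLike<'a>(Option<Noisy<'a>>);

fn cpp_style() -> u32 {
    let drops = Cell::new(0);
    {
        let mut src = UniquePtrLike(Some(Noisy(&drops)));
        let dest = UniquePtrLike(src.0.take()); // Nulls out `src`.
        drop(dest);
        // `src` is still destroyed here, but finds `None` and does nothing.
    }
    drops.get()
}
```

Either way the pointee is destroyed exactly once; the difference is who keeps track of the moved-from state.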

In order to support move-constructing both Rust and C++ types through a uniform interface, move_ctor is a fused destructor/copy operation. In the Rust case, no “destructor” is run, but in the C++ case we are required to run a destructor. Although this changes the semantic ordering of destruction compared to the equivalent C++ program, in practice, no one depends on moved-from objects actually being destroyed (that I know of).

After move_ctor is called, src must be treated as if it had just been destroyed. This means that the storage for src must be disposed of immediately, without running any destructors for the pointed-to value. Thus, no one must be able to witness the messed-up pinned state, which is why mov() requires P: DerefMove.

Thus, no code currently observing Pin<P> invariants in unsafe code will notice anything untoward going on. No destructive moves happen, and no moved-from state is able to hang around.

I’m pretty confident this argument is correct, but I’d appreciate some confirmation. In particular, someone involved in the UCG WG or the Async WG will have to point out if there are any holes.

The Upshot

In the end, we don’t just have a move constructors story, but a story for all kinds of construction, C++-style. Not only that, but we have almost natural syntax:

emplace! {
  let x = Foo::new();
  let y = ctor::mov(x);
  let z = ctor::copy(y);
}

// The make_box() example above can be added to `Box` through
// an extension trait.
let foo = Box::emplace(Foo::new());
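One possible shape for that extension trait, as a sketch only: the `Ctor` trait below is an assumed restatement of the one used throughout this post, and `BoxExt` and `FromValue` are hypothetical names.

```rust
use std::mem::MaybeUninit;
use std::pin::Pin;

/// Assumed shape of the `Ctor` trait, restated so the sketch compiles.
unsafe trait Ctor {
    type Output;
    unsafe fn ctor(self, dest: Pin<&mut MaybeUninit<Self::Output>>);
}

/// Hypothetical trivial constructor that just writes a value.
struct FromValue<T>(T);

unsafe impl<T> Ctor for FromValue<T> {
    type Output = T;
    unsafe fn ctor(self, dest: Pin<&mut MaybeUninit<T>>) {
        let slot = unsafe { dest.get_unchecked_mut() };
        slot.write(self.0);
    }
}

/// Hypothetical extension trait adding `Box::emplace`.
trait BoxExt<T> {
    fn emplace<C: Ctor<Output = T>>(c: C) -> Pin<Box<T>>;
}

impl<T> BoxExt<T> for Box<T> {
    fn emplace<C: Ctor<Output = T>>(c: C) -> Pin<Box<T>> {
        // Allocate first, then construct directly into the heap storage.
        let mut uninit: Box<MaybeUninit<T>> = Box::new(MaybeUninit::uninit());
        unsafe {
            c.ctor(Pin::new_unchecked(&mut *uninit));
            // The constructor initialized the storage, so the box can be
            // reinterpreted as holding a `T`.
            let init: Box<T> = Box::from_raw(Box::into_raw(uninit).cast::<T>());
            Box::into_pin(init)
        }
    }
}

fn emplace_demo() -> i32 {
    let boxed: Pin<Box<i32>> = Box::<i32>::emplace(FromValue(42));
    *boxed
}
```

The value is constructed at its final heap address, so the result can be handed out as a Pin<Box<T>> without ever having lived anywhere else.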

As far as I can tell, having some kind of “magic” around stack emplacement is unavoidable; this is a place where the language is unlikely to give us enough flexibility any time soon, though this concept of constructors is the first step towards such a thing.

We can call into C++ from Rust without any heap allocations at all (though maybe wasting an instruction or two shunting pointers across registers for our not-RVO):

/// Generated Rust type for bridging to C++, like you might get from `cxx`.
struct Foo { ... }
impl Foo {
  pub fn new(x: i32) -> impl Ctor { ... }
  pub fn set_x(self: Pin<&mut Self>, x: i32) { ... }
}

fn make_foo(out: Slot<Foo>) -> Pin<StackBox<Foo>> {
  let mut foo = out.emplace(Foo::new(42));
  // Mutate in place through the pin (the value is illustrative).
  foo.as_mut().set_x(5);
  foo
}

When dealing with slots explicitly is too much work, types can just be ctor::moved into a Box with Box::emplace.

I’ve implemented everything discussed in this post in a crate, moveit. Contributions and corrections are welcome.

A thanks to Manish Goregaokar, Alyssa Haroldson, and Adrian Taylor for feedback on early versions of this design.

Future Work

This is only the beginning: much work needs to be done in type design to have a good story for bridging move-only types from C++ to Rust, preferably automatically. Ctors are merely the theoretical foundation for building a more ergonomic FFI; usage patterns will likely determine where to go from here.

Open questions such as “how to containers” remain. Much like C++03’s std::auto_ptr, we have no hope of putting a StackBox<T> into a Vec<T>, and we’ll need to design a Vec variant that knows to call move constructors when resizing and copy constructors when cloning. There’s also no support for custom move/copy assignment beyond the trivial new (this) auto(that) pattern, and it’s unclear whether that’s useful. Do we want to port a constructor-friendly HashMap (Rust’s swisstable implementation)? Do we want to come up with macros that make dealing with Slot out-params less cumbersome?

Personally, I’m excited. This feels like a real breakthrough in one of the biggest questions for true Rust/C++ interop, and I’d like to see what people wind up building on top of it. ◼

  1. This isn’t exactly a universal opinion (glances at Swift), but it is if you write kernel code like me. 

  2. You can’t just have rustc consume a .h and spit out bindings, like e.g. Go can, but it’s better than the disaster that is JNI. 

  3. Some WG21 folks have tried to introduce a weaker type-trait, std::is_trivially_relocatable, which is a weakening of trivially moveable that permits a Rust-style destructive move. The libc++ implementations of most STL types, like std::unique_ptr, admit this trait. 

  4. A lot of unsafe Rust code assumes this is the only kind of move. For example, mem::swap() is implemented using memcpy. This is unlike the situation in C++, where types will often provide custom std::swap() implementations that preserve type invariants. 

  5. Because Future objects collapse their stack state into themselves when yielding, they may have pointers into themselves (as a stack typically does). Thus, Futures need to be guaranteed to never move once they begin executing, since Rust has no move constructors and no way to fix up the self-pointers. 

  6. Among other things, this means that std::unique_ptrs are passed on the stack, not in a register, which is very wasteful! Rust’s Box does not have this issue. 

  7. Rust has attempted to add something like DerefMove many times. What’s described in this post is nowhere near as powerful as a “real” DerefMove would be, since such a thing would also allow moving into a memory location.