Almost a year ago I developed the
moveit Rust library, which provides primitives for expressing something like C++’s
T&& and move constructors while retaining Rust’s so-called “destructive move property”: moving a value transfers ownership, rather than doing a funny copy.
In an earlier blogpost I described the theory behind this library and some of the motivation, which I feel fairly confident about, especially in how constructors (and their use of pinning) are defined.
However, there is a problem.
The old post is somewhat outdated, since
moveit uses different names for a lot of things that are geared to fit in with the rest of Rust.
The core abstraction of
moveit is the constructor: any type that implements the New trait. A
New type is not what is being constructed; rather, it represents a method of construction, resembling a specialized
Fn trait. The constructed type is given by the associated type Output.
Types that can be constructed are constructed in place, unlike most Rust types. This is a property shared by constructors in C++, allowing values to record their own address at the moment of creation. Explaining why this is useful is a bit long-winded, but let’s assume this is a thing we want to be able to do. Crucially, we need the output of a constructor to be pinned, which is why the
this output parameter is pinned.
Calling a constructor requires creating the output location in advance, so that it is ready to be handed to the constructor:
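A minimal sketch of that calling sequence, assuming a simplified stand-in for the New trait (the trait shape here is an approximation, and SelfAware, MakeSelfAware and construct_and_check are hypothetical names used only for illustration):

```rust
use std::marker::PhantomPinned;
use std::mem::MaybeUninit;
use std::pin::Pin;

/// Simplified stand-in for a constructor trait (an assumption, not
/// necessarily the library's exact definition).
unsafe trait New: Sized {
    type Output;
    /// Safety: `this` must point to fresh, uninitialized storage.
    unsafe fn new(self, this: Pin<&mut MaybeUninit<Self::Output>>);
}

/// Hypothetical type that records its own address at construction time.
struct SelfAware {
    addr: usize,
    _pin: PhantomPinned,
}

/// Hypothetical constructor for `SelfAware`.
struct MakeSelfAware;

unsafe impl New for MakeSelfAware {
    type Output = SelfAware;
    unsafe fn new(self, this: Pin<&mut MaybeUninit<SelfAware>>) {
        // We never move out of the storage, only initialize it in place.
        let slot = unsafe { this.get_unchecked_mut() };
        let addr = slot.as_ptr() as usize;
        slot.write(SelfAware { addr, _pin: PhantomPinned });
    }
}

/// Creates the output location first, then runs the constructor on it.
fn construct_and_check() -> bool {
    let mut storage = MaybeUninit::<SelfAware>::uninit();
    let result: Pin<&mut SelfAware> = unsafe {
        MakeSelfAware.new(Pin::new_unchecked(&mut storage));
        Pin::new_unchecked(storage.assume_init_mut())
    };
    // The recorded address matches where the value actually lives.
    result.addr == &*result as *const SelfAware as usize
}
```

Note that nothing in this sketch ever runs a destructor for the constructed value.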
However, this is not quite right.
Pin<P>’s docs are quite clear that, once we create a
Pin<&mut T>, we must call
T’s destructor before its memory is re-used; since reuse is unavoidable for stack data, and
storage will not do it for us (it’s a
MaybeUninit<T>, after all), we must somehow run the destructor separately.
One trick we could use is to replace
storage with some kind of wrapper over a
MaybeUninit<T> that calls the destructor for us:
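One way such a wrapper might look, as a sketch: it assumes the value has always been initialized by the time the wrapper is dropped. Noisy and drops_once are hypothetical helpers that only exist to make the destructor call observable.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;

/// Storage that promises to be initialized by the time it is dropped.
struct EventuallyInit<T>(MaybeUninit<T>);

impl<T> Drop for EventuallyInit<T> {
    fn drop(&mut self) {
        // Assumption: the contents were initialized before this point.
        unsafe { self.0.assume_init_drop() }
    }
}

/// Hypothetical helper type that counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Initializes the storage in place and lets the wrapper clean up.
fn drops_once() -> u32 {
    let counter = Cell::new(0);
    {
        let mut storage = EventuallyInit(MaybeUninit::uninit());
        storage.0.write(Noisy(&counter));
        // `storage` goes out of scope here and runs `Noisy`'s destructor.
    }
    counter.get()
}
```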
This works, but isn’t ideal, because now we can’t write down something like a C++ move constructor without running into the classic C++ problem: all objects must be destroyed unconditionally, so now you can have moved-from state. Giving up Rust’s moves-transfer-ownership (i.e. affine) property is bad, but it turns out to be avoidable!
There are also some scary details around panics here that I won’t get into.
moveit instead provides a
MoveRef<'frame, T> type that tries to capture the notion of what an “owning reference” could mean in Rust. An
&own type has been discussed many times, but implementing it in the full generality it would deserve as a language feature runs into some interesting problems due to how
Box<T>, the heap allocated equivalent, currently behaves.
We can think of
MoveRef<'frame, T> as wrapping the longest-lived
&mut T reference pointing to a particular location in memory. The longest-lived part is crucial, since it means that
MoveRef is entitled to run its pointee’s destructor:
No reference to the pointee can ever outlive the
MoveRef itself, by definition, so this is safe. The owner of a value is that which is entitled to destroy it, and therefore a
MoveRef literally owns its pointee. Of course, this means we can move out of it (which was the whole point of the original blogpost).
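A rough sketch of this idea, without any of the drop-flag machinery discussed later: the field layout and the new_unchecked constructor are assumptions made for illustration, and Noisy and drop_and_move_out are hypothetical helpers.

```rust
use std::cell::Cell;
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Sketch of an owning reference: the longest-lived `&mut T` to a location.
struct MoveRef<'frame, T> {
    ptr: &'frame mut T,
}

impl<'frame, T> MoveRef<'frame, T> {
    /// Safety: `ptr` must be the longest-lived reference to an initialized
    /// value that nothing else will destroy.
    unsafe fn new_unchecked(ptr: &'frame mut T) -> Self {
        MoveRef { ptr }
    }

    /// Moves the pointee out, suppressing our own destructor.
    fn into_inner(self) -> T {
        let value = unsafe { ptr::read(&*self.ptr) };
        mem::forget(self);
        value
    }
}

impl<T> Drop for MoveRef<'_, T> {
    fn drop(&mut self) {
        // We own the pointee, so we are entitled to destroy it.
        let p: *mut T = &mut *self.ptr;
        unsafe { ptr::drop_in_place(p) }
    }
}

/// Hypothetical helper type that counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn drop_and_move_out() -> (u32, String) {
    let counter = Cell::new(0);
    // `MaybeUninit` never drops its contents, so there is no double drop.
    let mut a = MaybeUninit::new(Noisy(&counter));
    let mut b = MaybeUninit::new(String::from("moved"));
    // Dropping the `MoveRef` runs the pointee's destructor.
    drop(unsafe { MoveRef::new_unchecked(a.assume_init_mut()) });
    // Moving out of the `MoveRef` transfers ownership to the caller.
    let s = unsafe { MoveRef::new_unchecked(b.assume_init_mut()) }.into_inner();
    (counter.get(), s)
}
```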
Because of this, we are further entitled to arbitrarily pin a
MoveRef with no consequences: pinning it would consume the unpinned
MoveRef (for obvious reasons,
MoveRefs cannot be reborrowed) so no unpinned reference may outlive the pinning operation.
This gives us a very natural solution to the problem above:
result should not be a
Pin<&mut T>, but rather a Pin<MoveRef<'frame, T>>.
This messy sequence of steps is nicely wrapped up in a macro provided by the library that ensures safe initialization and eventual destruction:
There is also some reasonably complex machinery that allows us to do something like an owning
Deref, which I’ll come back to in a bit.
However, there is a small wrinkle that I did not realize when I first designed
MoveRef: what happens if I mem::forget one?
Quashing destruction isn’t new to Rust: we can
mem::forget just about anything, leaking all kinds of resources. And that’s ok! Destructors alone cannot be used in type design to avert
unsafe catastrophe, a well-understood limitation of the language that we have experience designing libraries around, such as
MoveRef’s design creates a contradiction:
- MoveRef is an owning smart pointer, and therefore can be safely pinned, much like Box::into_pin() enables. Constructors, in particular, are designed to generate pinned MoveRefs.
- Forgetting a MoveRef will cause the pointee’s destructor to be suppressed, but its storage will still be freed and eventually re-used, a violation of the Pin contract.
This would appear to mean that a design like
MoveRef is not viable at all, and that this sort of “stack box” strategy is always unsound.
What about Box? Even though we can trivially create a Pin<Box<T>> via
Box::pin(), this is a red herring. When we mem::forget a
Box, we also forget about its storage. Because its storage has been leaked unrecoverably, we are still, technically, within the bounds of the
Pin contract. Only barely, but we’re inside the circle.
Interestingly, the Rust language has to deal with a similar problem; perhaps it suggests a way out?
Carefully crafted Rust code emits some very interesting assembly. I’ve annotated the key portion of the output with a play-by-play below.
The upshot is that
maybe_drop conditions the destructor of
x on a flag, which is allocated next to it on the stack. Rust flips this flag when the value is moved into another function, and only runs the destructor when the flag is left alone. In this case, LLVM folded the flag into the
bool argument, so this isn’t actually a meaningful perf hit.
These “drop flags” are key to Rust’s ownership model. Since ownership may be transferred dynamically due to reasonably complex control flow, it needs to leave breadcrumbs for itself to figure out whether the value wound up getting moved away or not. This is unique to Rust: in C++, every object is always destroyed, so no such faffing about is necessary.
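The effect of a drop flag can be observed from safe code. The following sketch is not the exact function whose assembly is discussed above; Noisy and count_drops are hypothetical helpers, and mem::forget stands in for "moved into another function".

```rust
use std::cell::Cell;
use std::mem;

/// Hypothetical helper type that counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

/// Whether `x` is still owned at the end of the scope depends on `flag`,
/// so the compiler must track at run time whether to run its destructor.
fn maybe_drop(flag: bool, counter: &Cell<u32>) {
    let x = Noisy(counter);
    if flag {
        // Ownership moves into `forget`; the drop flag records the move.
        mem::forget(x);
    }
    // `x` is dropped here only if it was not moved away above.
}

fn count_drops(flag: bool) -> u32 {
    let counter = Cell::new(0);
    maybe_drop(flag, &counter);
    counter.get()
}
```

The destructor runs exactly once when the value stays put, and not at all when it has been moved into mem::forget.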
moveit can close this soundness hole by leaving itself breadcrumbs to determine if safe code is trying to undermine its guarantees.
In other words: in Rust, it is not sufficient to manage a pointer to manage a memory location; it is necessary to manage an explicit or implicit drop flag as well.
We can extend
MoveRef to track an explicit drop flag:
Wrapping it in a
Cell is convenient and doesn’t cost us anything, since a
MoveRef can never be made
Sync anyways. Inside of its destructor, we can flip the flag, much like Rust flips a drop flag when transferring ownership to another function:
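A sketch of what this extension could look like, assuming the flag is held as a shared reference to a Cell<bool> that lives in the same frame as the storage (the field names and the two helper functions are hypothetical):

```rust
use std::cell::Cell;
use std::mem::{self, MaybeUninit};
use std::ptr;

/// Sketch of an owning reference that also tracks an explicit drop flag.
struct MoveRef<'frame, T> {
    ptr: &'frame mut T,
    drop_flag: &'frame Cell<bool>,
}

impl<T> Drop for MoveRef<'_, T> {
    fn drop(&mut self) {
        // Record that the pointee has been destroyed.
        self.drop_flag.set(true);
        let p: *mut T = &mut *self.ptr;
        unsafe { ptr::drop_in_place(p) }
    }
}

/// Dropping the reference normally flips the flag.
fn flag_after_drop() -> bool {
    let flag = Cell::new(false);
    let mut storage = MaybeUninit::new(String::from("hi"));
    let r = MoveRef { ptr: unsafe { storage.assume_init_mut() }, drop_flag: &flag };
    drop(r);
    flag.get()
}

/// Forgetting the reference leaves the flag untouched (and leaks the string).
fn flag_after_forget() -> bool {
    let flag = Cell::new(false);
    let mut storage = MaybeUninit::new(String::from("hi"));
    let r = MoveRef { ptr: unsafe { storage.assume_init_mut() }, drop_flag: &flag };
    mem::forget(r);
    flag.get()
}
```

The flag left at false in the second case is exactly the breadcrumb that something else in the frame can later inspect.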
But, how should we use it? The easiest way is to change the definition of
moveit!() to construct a flag trap:
The trap is a deterrent against forgetting a
MoveRef: because the
MoveRef’s destructor flips the flag, the trap’s destructor will notice if this doesn’t happen, and take action accordingly.
In moveit, this is actually implemented by having the
Slot<T> type carry a reference to the trap, created in the
slot!() macro. However, this is not a crucial detail for the design.
The trap is another RAII type that basically looks like this:
The trap is simple: if the contained drop flag is not flipped, it crashes the program. Because
moveit!() allocates it on the stack where users cannot
mem::forget it, its destructor is guaranteed to run before
storage’s destructor runs (although Rust does not guarantee destructors run, it does guarantee their order).
If a MoveRef is forgotten, it won’t have a chance to flip the flag, which the trap will detect. Once the trap’s destructor notices this, it cannot return, either normally or by panic, since this would cause
storage to be freed. Crashing the program is the only[1] acceptable response.
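A minimal sketch of such a trap and of the declaration order it relies on. The Trap name and the use of process::abort are assumptions for illustration; the only essential property is that the destructor never returns when the flag is unset.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::process;

/// Sketch of a trap that refuses to let the frame unwind past it
/// unless the drop flag has been flipped.
struct Trap<'frame>(&'frame Cell<bool>);

impl Drop for Trap<'_> {
    fn drop(&mut self) {
        if !self.0.get() {
            // Neither returning nor panicking is acceptable here, since
            // either would let the storage below us be freed.
            process::abort();
        }
    }
}

fn scope_with_trap() -> bool {
    // Locals are dropped in reverse declaration order, so the trap's
    // destructor runs before `_storage` goes away.
    let _storage = MaybeUninit::<String>::uninit();
    let flag = Cell::new(false);
    let _trap = Trap(&flag);

    // Stand-in for an owning reference's destructor flipping the flag.
    flag.set(true);
    flag.get()
}
```

If the flag.set(true) line were removed, the process would abort when _trap goes out of scope.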
MoveRef’s functions need to be adapted to this new behavior: for example,
MoveRef::into_inner() still needs to flip the flag, since moving out of the
MoveRef is equivalent to running the destructor for the purposes of drop flags.
In order for
MoveRef to be a proper “new” reference type, and not just a funny smart pointer, we also need an owning analogue of Deref for it.
This is the original design for
DerefMove, which had a two-phase operation: first
deinit() was used to create a destructor-suppressed version of the smart pointer that would only run the destructor for the storage (e.g., for
Box, only the deallocation); then
deref_move() would extract the “inner pointee” out of it as a
MoveRef. This had the effect of splitting the smart pointer’s destructor, much like we did above on the stack.
This has a number of usability problems. Not only does it need to be called through a macro, but
deinit() isn’t actually safe: failing to call
deref_move() is just as bad as calling
mem::forget on the result. Further, it’s not clear where to plumb the drop flags through.
After many attempts to graft drop flags onto this design, I replaced it with a completely new interface:
Uninit has been given the clearer name of
Storage: a type that owns just the storage of the moved-from pointer. The two functions were merged into a single, safe function that performs everything in one step, emitting the storage as an out-parameter.
DroppingSlot<T> is like a
Slot<T>, but closer to a safe version of the
EventuallyInit<T> type from earlier: its contents are not necessarily initialized, but if they are, it destroys them, and it only does so when its drop flag is set.
Box is the most illuminating example of this trait:
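To illustrate the one-step shape for Box, here is a heavily simplified sketch. It is not the library's implementation: drop flags are left out entirely, an Option<Box<MaybeUninit<T>>> stands in for DroppingSlot, and box_deref_move, Noisy and move_out_of_box are hypothetical names.

```rust
use std::cell::Cell;
use std::mem::MaybeUninit;
use std::ptr;

/// Sketch of an owning reference (no drop flag in this simplified version).
struct MoveRef<'frame, T> {
    ptr: &'frame mut T,
}

impl<T> Drop for MoveRef<'_, T> {
    fn drop(&mut self) {
        let p: *mut T = &mut *self.ptr;
        unsafe { ptr::drop_in_place(p) }
    }
}

/// Splits a `Box<T>` into its storage (parked in the out-parameter) and
/// an owning reference to the pointee, in a single safe step.
fn box_deref_move<'frame, T>(
    this: Box<T>,
    storage: &'frame mut Option<Box<MaybeUninit<T>>>,
) -> MoveRef<'frame, T> {
    // Reinterpret the box as owning only the allocation, not the value.
    let raw = Box::into_raw(this) as *mut MaybeUninit<T>;
    let slot = storage.insert(unsafe { Box::from_raw(raw) });
    // The value is still initialized; the `MoveRef` now owns it.
    MoveRef { ptr: unsafe { slot.assume_init_mut() } }
}

/// Hypothetical helper type that counts how often it is dropped.
struct Noisy<'a>(&'a Cell<u32>);

impl Drop for Noisy<'_> {
    fn drop(&mut self) {
        self.0.set(self.0.get() + 1);
    }
}

fn move_out_of_box() -> u32 {
    let counter = Cell::new(0);
    let mut storage = None;
    let r = box_deref_move(Box::new(Noisy(&counter)), &mut storage);
    drop(r); // runs the pointee's destructor
    drop(storage); // frees the allocation without touching the pointee
    counter.get()
}
```

The pointee's destructor and the deallocation are now two separate events, which is the split that the storage out-parameter exists to express.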
MoveRef’s own implementation illustrates the need for the explicit lifetime bound:
Since this is fundamentally a lifetime narrowing, this can only compile if we insist that
'a: 'frame, which is implied by
Self: 'frame. Earlier iterations of this design enforced it via a
MoveRef<'frame, Self> receiver, which turned out to be unnecessary.
As of writing, I’m still in the process of self-reviewing this change, but at this point I feel reasonably confident that it’s correct; this article is, in part, written to convince myself that I’ve done this correctly.
The new design will also enable me to finally complete my implementation of a constructor and pinning-friendly vector type; this issue came up in part because the vector type needs to manipulate drop flags in a complex way. For this reason, the real implementation of drop flags uses a counter, not a single boolean.
I doubt this is the last issue I’ll need to chase down in
moveit, but for now, we’re ever-closer to true owning references in Rust. ◼
[1] Arguably, running the skipped destructor is also a valid remediation strategy. However, this is incompatible with what the user requested: they asked for the destructor to be suppressed, not for it to be run at a later date. This would be somewhat surprising behavior, which we could warn about for the benefit of
unsafe code, but ultimately the incorrect choice for non-stack storage, such as a
MoveRef referring to the heap. ↩