Rust and C++ both have very similar operational semantics for their “anonymous function” expressions (they call them “closures” and “lambdas” respectively; I will use these interchangably). Here’s what those expressions look like.
The type of square
in both versions is an anonymous type that holds the captures for that closure. In C++, this type provides an operator()
member that can be used to call it, wheras in Rust, it implements FnOnce
(and possibly FnMut
and Fn
, depending on the captures), which represent a “callable” object.
For the purposes of this article, I am going to regard “function item values” as being identical to closures that explicitly specify their inputs and outputs for all intents and purposes. This is not completely accurate, because when I write
let x = drop;
, the resulting object is generic, but whenever I say “a closure” in Rust, I am also including these closure-like types too.
There is one thing C++ closures can express which Rust closures can’t: you can’t create a “generic” closure in Rust. In particular, in C++ we can write this code.
The auto
keyword in a closure in C++ does not work like in Rust. In Rust, if try to write “equivalent” code, let x = |val| val.len();
, on its own, we get this error:
This is because in Rust, a closure argument without a type annotation means “please deduce what this should be”, so it participates in Rust’s type inference, wheras in C++ an auto
argument means “make this a template parameter of operator()
”.
How would we implement CallMany
in Rust, anyways? We could try but we quickly hit a problem:
What should we put in the ???
? It can’t be a type parameter of call_many
, since that has a concrete value in the body of the function. We want to say that Fn
can accept any argument that implements len
. There isn’t even syntax to describe this, but you could imagine adding a version of for<...>
that works on types, and write something like this.
The imaginary syntax for<T: Len> Fn(&T) -> usize
means “implements Fn
for all all types T
that implement Len
”. This is a pretty intense thing to ask rustc to prove. It is not unachievable, but it would be hard to implement.
For the purposes of this article, I am going to consider
for<T>
a plausible, if unlikely, language feature. I will neither assume it will ever happen, nor that we should give up on ever having it. This “middle of uncertainty” is important to ensure that we do not make adding this feature impossible in the discussion that follows.
A Workaround
Let’s examine the Fn
trait, greatly simplified.
Fn::call
is analogous to operator()
in C++. When we say that we want a “generic closure”, we mean that we want to instead have a trait that looks a bit more like this:
Notice how Args
has moved from being a trait parameter to being a function parameter, and Output
now depends on it. This is a slightly different formulation from what we described above, because we are no longer demanding an infinitude of trait implementations, but now the implementation of one trait with a generic method.
For our specific example, we want something like this.
This compiles and expresses what we want precisely: we want to call f
on arbitrary impl Len
types.
But how do we call call_many
? That starts to get pretty ugly.
This has the potential to get really, really ugly. I used this pattern for a non-allocating visitor I wrote recently, and it wasn’t pretty. I had to write a macro to cut down on the boilerplate.
This macro is, unsurprisingly, quite janky. It also can’t really do captures, because the $cb
argument that contains the actual code is buried inside of a nested impl
.
You might think “well Miguel, why don’t you hoist $cb
into the Cb
struct?” The problem is now that I need to write impl<'s, F: FnMut(&Parser<'s>, ???)>
so that I can actually call the callback in the body of Resume::resume
, but that brings us back to our trait bound problem from the start!
This is a general problem with this type of solution: there is no macro you can write that will capture an arbitrary closure to implement a trait by calling that closure, if the method being implemented is generic, because if you could, I wouldn’t have to bother with the macro.
Let’s Talk About Java
Java gets a bad rap but the core language does have some interesting features in it. A very handy one is an anonymous class.
Let’s suppose I want to pass a callback into something. In Java 6, which I grew up on, you did it like this:
The new Interface() {...}
syntax mints a new class on the spot that implements Interface
. You provide a standard class body between the braces, after the name of the type. You can also do this with a class type too.
Now, this is a bit tedious: I need to re-type the signature of the one method. This is fine if I need to implement a bunch of methods, but it’s a little annoying in the one-method case.
In Java 8 we got lambdas (syntax: x -> expr
). Java made the interesting choice of not adding a Function
type to be “the type of lambdas”. For a long time I thought this was a weird cop-out but I have since come to regard it as a masterclass in language design.
Instead, Java’s lambdas are a sort of syntax sugar over this anonymous class syntax.1 Instead, you need to assign a lambda to an interface type with a single abstract method, and it will use the body of the lambda to implement that one method.
Interfaces compatible with lambdas are called single abstract method (SAM) interfaces.
So, without needing to touch the existing library, I can turn the new
syntax into this:
chef’s kiss
Mind, Java does provide a mess of “standard function interfaces” in the java.util.functional
package, and quite a bit of the standard library uses them, but they don’t need to express the totality of functions you might want to capture as objects.
These “SAM closures” give closures a powerful “BYO interface” aspect. Lambdas in Java are not “function objects”, they are extremely lightweight anonymous classes the pertinent interface.
I think this can let us cut the gordian knot of generic closures in Rust.
SAM in Rust
In what remains I will propose how we can extend the traits that closures implement to be any SAM trait, in addition to the traits they implement ipso facto.
What’s a SAM trait in Rust? It’s any trait T
with precisely ONE method that does not have a default implementation, which must satisfy the following constraints:
- It must have a
self
parameter with typeSelf
,&Self
, or&mut Self
. - It does not mention
Self
in any part of its argument types, its return type, or itswhere
clauses, except for the aforementionedself
parameter. - Has no associated consts and no GATs.
- All of its supertraits are
Copy
,Send
, orSync
.
These restrictions are chosen so that we have a shot at actually implementing the entire trait.
In addition to the
Fn
traits, ordinary closures automatically implementClone
,Copy
,Send
, andSync
as appropriate.None of these traits are SAM, so we can safely allow them to be automatically derived for SAM closures to, under the same rules as for ordinary closures.
To request a SAM closure, I will use the tentative syntax of impl Trait |args| expr
. This syntax is unambiguously an expression rather than an impl
item, because a |
cannot appear in a path-in-type, and impl $path
must be followed by {
, for
or where
. The precise syntax is unimportant.
Applied to the call_many
example above, we get this.
The compiler rewrites this into something like this.
This rewrite can happen relatively early, before we need to infer a type for x
. We also need to verify that this trait’s captures are compatible with an &self
receiver The same rules for when a trait implements Fn
, FnMut
, and FnOnce
would decide which of the three receiver types the closure is compatible with.
Note that SAM closures WOULD NOT implement any Fn
traits.
More Complicated Examples
We are required to name the trait we want but its type parameters can be left up in the air. For example:
In general, unspecified parameters and associated types result in inference variables, which are resolved in the same way as the parameters of the Fn
closures are.
In fact, we can emulate ordinary closures using SAM closures.
Note that because Fn
and FnMut
have non-trival supertraits we can’t make them out of SAM closures.
One application is to completely obsolete std::iter::from_fn
.
Or, if you need a quick helper implementation of Debug
…
There are probably additional restrictions we will want to place on the SAM trait, but it’s not immediately clear what the breadth of those are. For example, we probably shouldn’t try to make this work:
There are definitely clever tricks you can play to make this work, but the benefit seems slim.
Future Work
There’s two avenues for how we could extend this concept. The first is straightforward and desireable; the second is probably unimplementable.
Anonymous Trait Impls
Backing up from the Java equivalent of lambdas, it seems not unreasonable to have a full-fledged expression version of impl
that can make captures.
Syntactically, I will use impl Trait for { ... }
. This is currently unambiguous, although I think that making it so that {
cannot start a type is probably a non-starter.
Let’s pick something mildly complicated… like Iterator
with an overriden method. Then we might write something like this.
The contents of the braces after for
is an item list, except that variables from the outside are available, having the semantics of captures; they are, in effect, accesses of self
without the self.
prefix.
Hammering out precisely how this would interact with the self types of the functions in the body seems… complicated. Pretty doable, just fussy. There are also awkward questions about what Self
is here and to what degree you’re allowed to interact with it.
Trait Inference
Suppose that we could instead “just” write impl |x| x * x
and have the compiler figure out what trait we want (to say nothing of making this the default behavior and dropping the leading impl
keyword).
This means that I could just write
We get into trouble fast.
If the type of x
implements T2
directly, we print "hello"
, but if we decide it implements T1
instead, it doesn’t, because we get the blanket impl. If it decides it should implement both… we get a coherence violation.
Currently, rustc does not have to produce impls “on demand”; the trait solver has a finite set of impls to look at. What we are asking the trait solver to do is to, for certain types, attempt to reify impls based on usage. I.e., I have my opaque closure type T
and I the compiler decided it needed to prove a T: Foo
bound so now it gets to perform type checking to validate whether it has an impl
.
This seems unimplementable with how the solver currently works. It is not insurmountable! But it would be very hard.
It is possible that there are relaxations of this that are not insane to implement, e.g. the impl ||
expression is used to initialize an argument to a function that happens to be generic, so we can steal the bounds off of that type variable and hope it’s SAM. But realistically, this direction is more trouble than its worth.
Conclusion
Generic lambdas are extremely powerful in C++, and allow for very slick API designs; I often miss them in Rust. Although it feels like there is an insurmountable obstruction, I hope that the SAM interface approach offers a simpler, and possibly more pragmatic, approach to making them work in Rust.
-
Except for the part where they are extremely not. Where
new T() {}
mints a brand new class and accompanying.class
file, Java lambdas use this complicated machinery from Java 7 to generate method handles on the fly, via theinvokedynamic
JVM instruction. This, I’m told, makes them much easier to optimize. ↩