Why Rust Closures are (Somewhat) Hard
The Easy Case: Lua
Since hard is always relative to something else, I’d like to start with a dynamic language. Functions in Lua are essentially anonymous and can capture variables from their environment - much like with JavaScript, which Lua resembles in several important ways):
local m,c = 1,2
local line = function(x) return m*x + c end
print(line(0)) -- 2
print(line(1)) -- 3
function call(f,x)
return f(x)
end
print(call(line,1)) --3
That’s cool and useful - we have created a function and bound
it to the variable line
. These function values can be passed around,
since functions are first class objects in Lua:
Since functions are values, it is easy to write functions which return functions - higher-order functions:
function make_line(m,c)
return function (x)
return m*x + c
end
end
local line = make_line(1,2)
-- as before
Rust closures are harder for three main reasons:
The first is that it is both statically and strongly typed, so we’ll need to explicitly annotate these function types.
Second, Lua functions are dynamically allocated (‘boxed’.) Rust does not allocate silently because it prefers to be explicit and is a system language designed for maximally efficient code.
Third, closures share references with their environment. In the case of Lua, the garbage collector ensures that these references will live long enough. With Rust, the borrow checker needs to be able to track the lifetimes of these references.
The Easy Case: Rust
The notation for Rust closures is very concise in comparison:
let m = 1.0;
let c = 2.0;
let line = |x| m*x + c;
println!("{} {}", line(0.0), line(1.0));
Two points need emphasis. The first is that closures are quite distinct from
plain functions - I can define a function line
here but it will not share
references to the local variables m
and c
. The second is that the argument
and return type are established by type inference. As soon as we say line(0.0)
the compiler knows that this closure takes a f64
and returns an f64
. If I
subsequently try to call line(1)
it will complain because no way Rust will
convert an integer into a float without a typecast.
Rust has a very similar attitude to C++ here - that closures should be a zero-overhead abstraction. Here we have a vector of integers, and wish to know how many elements are greater than zero:
let k = my_vec.iter().filter(|n| **n > 0).count();
The guarantee is that this will be just as fast as writing out an explicit loop!
What is the type of n
? iter()
produces a iterator over references to the
elements (say &i32
) and filter
takes a closure which is passed a reference to
the iterator type - so it will be &&i32
in this case. So we need a double dereference
to get the actual integer - Rust references don’t automagically dereference themselves,
except if a method is called. (A more idiomatic way of writing this is |&&n| n > 0
.)
So, although type inference saves us typing and makes iterator expressions much less noisy,
it does not not save us from having to know the type.
The way that Rust (and C++) avoids overhead in this most important case is not to box closures. This means that they can be inlined.
How Rust Closures are Implemented
Our first little example is syntactical sugar for creating a struct:
struct SomeUnknownType<'a> {
m: &'a f64,
c: &'a f64
}
impl <'a>SomeUnknownType<'a> {
// pseudo-code - note the 'call operator' is a _method_
fn call(&self, x: f64) -> f64 {
self.m * x + self.c
}
}
let s = SomeUnknownType{m: &1.0, c: &2.0};
assert_eq!(s.call(1.0), 3.0);
The actual concrete type is not known to the
program. You will only see this type in error messages - here we use a common
trick to provoke rustc
into telling us the actual type of the right-hand side:
27 | let f: () = |x| m*x + c;
| ^^^^^^^^^^^ expected (), found closure
|
= note: expected type `()`
found type `[closure@/home/steve/closure1.rs:27:17: 27:28 m:_, c:_]`
This a struct that borrows references to its environment and so must have a lifetime.
This is pretty much what happens with C++ lambdas, where the actual generated type
is unknown. (This is one of the reasons why auto
has become so important in
modern C++, because not all types can be expressed explicitly). With C++ one
has detailed control about what gets captured by reference or by copy; with Rust
by default everything is borrowed unless it will be consumed within the
closure.
let a = "hello".to_string();
let fs = || {
let b = a;
println!("we got {}", b);
};
println!("a is {}", a);
// 25 | let fs = || {
// | -- value moved (into closure) here
//...
//29 | println!("a is {}", a);
// | ^ value used here after move
In this case, a
has to move into b
, and then gets dropped.
In the case of |x| m*x + c
, we aren’t consuming m
or c
, so these are references.
However, if we said move |x| m*x + c
then the struct would look like this:
struct AnotherUnknownType {
m: f64,
c: f64
}
impl AnotherUnknownType {
fn call(&self, x: f64) -> f64 {
self.m * x + self.c
}
}
The environment has been captured by copying m
and c
- no borrowing and
no lifetime associated with borrowing.
So there’s no magic involved with Rust closures - just sugar.
These are obviously very different types - what do they have in common?
They both match the trait bound Fn(f64)->f64
. Here is a generic function
which simply invokes the closure it has been given:
fn invoke<F>(f: F, x: f64) -> f64
where F: Fn(f64)->f64 {
f(x)
}
With Rust 1.26, there’s a simpler but completely equivalent notation:
fn new_invoke(f: impl Fn(f64)->f64, x: f64) -> f64 {
f(x)
}
Either way, the type bound for the f
argument reads:
f
is any type that implements Fn(f64)->f64
.
Now, if we wanted to store closures, we have to box them. We can store
any functions matching Fn(f64)->f64
as Box<Fn(f64)->f64>
. A boxed closure
is also callable, but has a known fixed size because it is a pointer to the
closure struct, so you can put these boxes into a vector or a map.
It is equivalent to the std::function
type in C++, which also involves
calling a virtual method. Box<Fn(f64)->f64>
is a Rust trait object.
These are three function traits in Rust, which correspond to the three kinds of methods (remember that calling a closure is executing a method on a struct)
Fn
call is&self
methodFnMut
call is&mut self
methodFnOnce
call isself
method
Implications of “Closures are Structs”
|x| m*x + c
has a lifetime tied to that of m
and c
.
It cannot live longer than these variables. The lifetime appears explictly
when we try to store closures in boxes:
struct HasAClosure<'a> {
closure: Box<Fn(f64)->f64 + 'a>
}
impl <'a>HasAClosure<'a> {
fn new<C>(f: C) -> HasAClosure<'a>
where C: Fn(f64)->f64 + 'a {
HasAClosure {
closure: Box::new(f)
}
}
}
I recall being a little puzzled at the lifetime bound, until I fully internalized the fact that closures are structs, and structs that borrow have lifetimes.
The borrow checker is going to be particularly strict about closures that borrow
mutably (FnMut
) since there may be only one mutable reference to a value, and
that reference will be captured by the closure.
let mut x = 1.0;
let mut change_x = || x = 2.0;
change_x();
println!("{}",x);
....
25 | let mut change_x = || x = 2.0;
| -- - previous borrow occurs due to use of `x` in closure
| |
| mutable borrow occurs here
26 | change_x();
27 | println!("{}",x);
| ^ immutable borrow occurs here
The closure bound to change_x
has got its sticky paws on x
and will not release
it until it goes out of scope! So this will work:
let mut x = 1.0;
{
let mut change_x = || x = 2.0;
change_x();
}
println!("{}",x);
This is annoying, but it truly becomes a bastard when you have some struct keeping actions as closures. Only one of those actions can manipulate the environment! Working with Rust is sometimes like the old joke about the person who goes to the doctor: “It hurts when I do this”. To which the doctor replies, “Then don’t do it”. In the case of actions that need to manipulate some state, make the owner of the actions keep the state and explicitly pass a mutable reference to it in the actions. That is, the closures themselves do not borrow mutably.
The problem goes beyond mutable references, particularly when the lifetime of the closures has to be longer than any temporary scope. For instance, you may attach an action to a button in some function - that action must last longer than the function.
move
closures avoid borrow-checking problems by avoiding borrowing - they move
the values. If those values are Copy
, then Rust will copy. But otherwise the value
will be moved and not be available afterwards. This is the only way to get
a closure with a 'static
lifetime.
A common strategy is to use shared references like
Rc
and Arc
(equivalent to C++’s shared_ptr
). Cloning a shared reference just
increments the reference count, and dropping them decrements the count; when the count
goes to zero the actual value is dropped. This is a kind of garbage collection and
provides the important guarantee that the references will last ‘just long enough’. So typically
you would clone a reference and move it into a closure, and avoid explicit lifetime
problems.
Threads
There is an intimate relationship between threads and closures - std::thread::spawn
is
passed a closure and runs in a new thread. In this case, Rust insists that the closure has
a static lifetime, and usually we move
the closure.
This isn’t an inherent feature of threads though! Consider scoped threads from the crossbeam crate:
let greeting = "hello".to_string();
crossbeam::scope(|scope| {
scope.spawn(|| {
println!("{} thread!", greeting);
});
});
The spawned thread is guaranteed to be terminated by the end of the scope, and it is
quite happy with the lifetime of greeting
, which lasts longer than the scope.
It gets better:
let mut greeting = "hello".to_string();
crossbeam::scope(|scope| {
scope.spawn(|| {
greeting.push_str(" eh");
println!("{} thread!", greeting);
});
});
There’s no problem passing a mutable reference either. But, if you tried to mutate
the greeting in another thread within the scope you would get the usual “only one mutable
reference allowed” error. For sharing between threads you will still need Arc
.
However, when declaring closures that need to run in another thread, you need to reassure Rust that it is safe to do so.
let mut greeting: &((Fn()->&'static str) = &(|| "hello");
crossbeam::scope(|scope| {
scope.spawn(|| {
let res = greeting();
println!("{} thread!", res);
});
});
Here greeting
is a reference to any closure of no arguments that returns static strings.
The error is instructive:
27 | scope.spawn(|| {
| ^^^^^ `Fn() -> &str` cannot be shared between threads safely
|
= help: within `&Fn() -> &str`, the trait `Sync` is not implemented for `Fn() -> &str`
And the problem goes away if we explicitly add this marker trait to the declaration:
let mut greeting: &((Fn()->&'static str + Sync) = &(|| "hello");
Higher-Order Functions
These are less elegant than with Lua. Explicit type annotations are a necessary part of life (if you want to remain sane.)
In the Bad Old Days, explicit boxing was your only option if you wanted to return a closure:
fn old_adder(a: f64) -> Box<Fn(f64)->f64> {
Box::new(move |x| a + x)
}
let a1 = old_adder(1.0);
assert_eq!(a1(2.0), 3.0);
Boxing is not always a bad choice, but it does make certain optimizations
impossible - like inlining the closure completely. (Note that the returned closure has
to be move
, since any reference to the argument a
is not going
to last longer than the body of old_adder
. For a primitive like f64
,
moving means copying.)
With Rust 1.26, you can say:
fn new_adder(a: f64) -> impl Fn(f64)->f64 {
move |x| a + x
}
That is, new_adder
returns a particular type, but all the caller needs know
is that it implements Fn(f64)->f64
. In this case, even the callee doesn’t
know the exact type! Cleaner, and no boxing involved.
impl Trait
certainly helps with generic function signatures, although they
can still get a little unwieldy:
fn compose (f1: impl Fn(f64)->f64, f2: impl Fn(f64)->f64) -> impl Fn(f64)->f64 {
move |x| f1(f2(x))
}
let f = compose(f64::sin, |x| x*x);
...
Note that plain old functions match the same trait as closures.
These are already generic functions, since they are passed any type
that implements Fn(f64)->f64
; for each unique type, a new implementation
is created - so-called monomorphism (the expression has the same shape,
but the generated code is different in each case.)
We can get even more generic, and define compose
for any function
that maps a value to another value of that same type:
fn compose <T>(f1: impl Fn(T)->T, f2: impl Fn(T)->T) -> impl Fn(T)->T {
move |x| f1(f2(x))
}
let g = compose(str::trim_left, |s| &s[0..2]);
assert_eq!(g(" hello "),"h");
....
Type inference is a marvelous thing - compose
knows from the first argument that the
type parameter T
must be &str
in this case.
An entertaining property of Rust generic functions is that the declaration is often more scary than the implementation. C++ cheats with templates, since they are effectively compile-time duck-typing. Easier to type, certainly, but the error messages can be less than helpful.
Summary: the Hardness is (Mostly) Necessary
There’s always going to be some trade-off between programmer convenience and performance. Rust would certainly be easier if it boxed closures and used ‘smart pointer’ references by default, but it would not be Rust - in fact, it would be similar to Swift. Instead, Rust aims for the same niche occupied by C, C++ and Ada - predictable high performance and optimal use of system resources (a “systems language”). The novelty it brings to that party is explicitly enforcing lifetime analysis on references.
Closures make functional style possible, and Rust’s implementation makes
this efficient as possible. There is a lot less explicit looping needed,
and you (generally) don’t have to worry about the performance implications
of avoiding loops. Here Rust is following the ‘generic programming’
tradition of C++, where you don’t loop if the operation is already
provided by <algorithm>
- the main difference is that Iterator
takes the place of iterator ranges and the operations are methods.
Closures are synthesized structs which either borrow or move/copy their environment, very much like C++ lambdas. The restrictions we’ve discussed follow directly from this implementation and its interaction with the borrow checker.