Asynchronous code does not require the rest of your code to be asynchronous. I can’t say the same for blocking code.
Disclaimer: I am one of the maintainers for `smol`, a small and fast async runtime for Rust.
I’ve been involved in the Rust community for four years at this point, and in that time I’ve seen a lot of criticism of `async`. I’ve found it to be an elegant model for programming that easily outclasses the alternatives. I use it frequently in my own programs when it fits. There are a lot of programs that would be improved by `async` but don’t use it, because people are scared of it. In fact, many organizations have a “hard ban” on `async` code.
Some of this criticism is valid. `async` code is a little hard to wrap your head around, but that’s true of many other concepts in Rust, like the borrow checker and the weekly blood sacrifices. Many popular `async` libraries are explicitly tied to heavyweight crates like `tokio` and `futures`, which aren’t good picks for many types of programs. There are also a lot of language features that need to be released before `async` can be used without an annoying amount of `Box`ing `dyn`amic objects.
There’s one point, though, that I’ve heard quite frequently, and I think it’s misleading. Let’s talk about it.
What’s in a leak?
I’ve seen a lot of people say that `async` is a “leaky abstraction”. What this means is that the presence of `async` in a program forces you to bend the program’s control flow to accommodate it. If you have 100 files in your program, and one of those files uses `async`, you have to either write the entire program in `async` or resort to mind-bending hacks to contain it. Just like an in-law moving into your spare bedroom.
I do not mean memory leaks, which happen when you fail to free memory that you allocate. Neither `async` nor blocking code has an intrinsic problem with memory leaks.
Dependency Dog: If you want to see a good example of a leaky abstraction, consider AppKit. Not only is AppKit thread-unsafe to the point where many functions can only safely be called on the main thread, it forces you into Apple’s terrifying Objective-C model. Basically any program that wants a working GUI on macOS needs to interface in Apple’s way, with basically no alternatives.
I’ve seen the “What Color is Your Function?” blogpost by Bob Nystrom referenced a lot in these discussions. This blogpost was originally written with JavaScript’s callbacks in mind. Fair enough. The callback model is hard to deal with, and its enduring popularity in the Rust ecosystem is something I’ll have to write a blogpost about someday. He also mentions `async`/`await` as a potential solution to this problem, although one that he is unsatisfied with, as it still divides the ecosystem into asynchronous and synchronous halves.
While this blogpost may be correct when it comes to JavaScript and other, higher-level languages, I believe that Rust stands out in such a way that it’s not true for this language. In fact, I believe the opposite is true. Non-`async` code (or “blocking” code) is the real leaky abstraction.
Object Class: Safe
I’d like to discuss how you call blocking code from `async` code, and vice versa. That way we can compare.
Let’s make a table to describe how it goes calling functions from one “color” to another. You can call blocking code from blocking code without any issues. You can also call asynchronous code from asynchronous code trivially. There is also a strategy for calling asynchronous code from blocking code that I will go into shortly. So our table looks like this:
| code ↓ calls → | `async` | blocking |
| --- | --- | --- |
| `async` | Trivial | Generally Easy |
| blocking | We’ll see… | Trivial |
Note that not all code fits cleanly into the `async`/blocking categories. A notorious example is GUI code, which uses blocking semantics but overall acts a lot like `async` code, in that it’s not allowed to block. But that’s a topic for another post.
When you write an `async` function, it returns a `Future`, which represents a value that will eventually be resolved. There are a lot of things you can do with a `Future`. You can race it against another `Future`, spawn it on an executor, and any number of other operations. It’s a point I delve deeper into in this post.

However, one of the simpler operations is to just wait for a `Future` to complete. Often, the waiting is done by blocking the current thread. So by “blocking on” the `Future`, we can effectively turn an `async` function into a synchronous call.
```rust
async fn my_async_code() { /* ... */ }

fn my_main_blocking_code() {
    use futures_lite::future::block_on;
    block_on(my_async_code());
}
```
`block_on` takes any `Future`, whether it’s `!Send`, not `'static`, or about to explode. So literally any `async` function can be called from synchronous code.
It’s relatively simple, too. `block_on` is implemented like this:
```rust
pub fn block_on<T>(future: impl Future<Output = T>) -> T {
    // A `Context` with a `Waker` is needed to poll a `Future`.
    let waker = waker_that_blocks_current_thread();
    let mut context = Context::from_waker(&waker);

    // Pin the future to the stack.
    // This used to require `unsafe` code, but doesn't anymore!
    let mut future = std::pin::pin!(future);

    // Poll the future in a loop, blocking the thread while we wait.
    loop {
        match future.as_mut().poll(&mut context) {
            Poll::Ready(value) => return value,
            Poll::Pending => block_thread_until_waker_wakes_us(),
        }
    }
}
```
Dependency Dog: The actual `block_on` is a little more complicated. It has some logic to reuse the waker between function calls, reducing the overhead to one thread-local key access and nothing else.
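To make the sketch above concrete, the placeholder functions can be filled in using nothing but the standard library, with a `Waker` that unparks the calling thread. This is a minimal sketch in the same shape as `pollster`, not its actual code, and `ThreadWaker` is a name I made up. As a bonus, the `main` function demonstrates the point about `!Send` futures: an `Rc` can live inside the future just fine, because everything stays on the current thread.

```rust
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// A `Waker` that unparks the thread that created it.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

fn block_on<T>(future: impl Future<Output = T>) -> T {
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut context = Context::from_waker(&waker);
    let mut future = pin!(future);

    loop {
        match future.as_mut().poll(&mut context) {
            // The future finished; hand back its output.
            Poll::Ready(value) => return value,
            // Park until some `wake()` unparks us, then poll again.
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    // `Rc` is `!Send`, so this future could never move to another thread...
    let shared = Rc::new(7);
    // ...but `block_on` drives it right here on the current thread.
    assert_eq!(block_on(async move { *shared * 6 }), 42);
}
```

Note that `thread::park` is allowed to wake spuriously, which is harmless here: a spurious wakeup just means one extra poll.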
Okay, but what if you don’t want `futures_lite` in your dependency tree? `futures_lite` isn’t the heaviest dependency on the block (that’s `futures`), but it’s still a non-negligible amount of code. No need to worry! There’s also `pollster`, which has zero (required) dependencies and consists of fewer than 100 lines of code.
```rust
fn my_main_blocking_code() {
    use pollster::block_on;
    block_on(my_async_code());
}
```
So, calling `async` code from blocking code is easy. Just call `block_on`. It’s that simple!
It’s not that simple
Of course it’s not that simple. I’m sure people familiar with actually calling `async` code from blocking code are screaming at their screens right now. So let’s address that.
There are a substantial number of `async` crates out there that run on top of `tokio`. They use `tokio`’s primitives, `tokio`’s executor, and `tokio`’s I/O semantics. Because of this, they rely on `tokio`’s runtime to be running in the background. If you try the above strategy with a crate that relies on `tokio`, it will fail at runtime with a panic.
No need to fear. We can start a `tokio` runtime and let it peacefully run in the background, forever. The libraries are able to pick up on this runtime and use it.

In `main()`, during your program initialization, put this:
```rust
use std::{future, thread};

fn main() {
    // Create a runtime.
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();

    // Clone a handle to the runtime and send it to another thread.
    thread::spawn({
        let handle = rt.handle().clone();

        // Run the runtime on this thread, forever.
        move || handle.block_on(future::pending::<()>())
    });

    // "Enter" the runtime and let it sit there.
    let _guard = rt.enter();

    // Block on any futures.
    pollster::block_on(my_async_function());
}
```
For any `block_on` calls in your application, the runtime will already be available. Note that you will need to call `enter()` on any new threads that use `tokio` primitives. Thankfully you can get a `Handle` to the runtime, which can be sent to any thread and is also cheaply clonable.
But that’s really it. Once you have the runtime humming away in the background, `tokio` futures should Just Work!
As an aside, another hitch is that `block_on` and functions like it are only available on `std`-enabled platforms. But the `no_std` `async` story is a blogpost for another day.
A Quick Segue into `tokio::main`
I’ve seen some people recommend using the `tokio::main` attribute to turn an `async` function into a blocking function, then calling that from your real code. For example:
```rust
#[tokio::main(flavor = "current_thread")]
async fn my_async_code() { /* ... */ }

fn main() {
    // `tokio::main` transparently converts `my_async_code` into a blocking function.
    my_async_code();
}
```
It’s a little impressive, if a little hacky. The `async` function is turned into a blocking function by the proc macro.
But… just don’t do this. It means that every time `my_async_code` is called, it spins up a `tokio` runtime, runs the code, then immediately throws that runtime away. For functions that are called a lot, that overhead really adds up. In addition, it makes the function signature misleading: it’s a blocking function, not an `async` function!
Meanwhile, for blocking code…
First off, I find `async` code to be more predictable than blocking code, in a weird way. Look at this function signature:

```rust
async fn my_async_function() { /* ... */ }
```
What does this tell you? I know that the `Future` returned by this function won’t block. I can place it in my executor of choice, or race it against any other `Future`s, without worrying that it will hog the execution loop. By convention, `poll()` will probably run in a time period close to “instant” before yielding and letting something else take over.
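As a hand-rolled sketch of that convention (`YieldNow` and `NoopWaker` are my own illustrative names, not real library types): a cooperative future does a small amount of work per `poll`, arranges its own wake-up, and returns `Pending` rather than hogging the thread.

```rust
use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A well-formed future: each call to `poll` returns almost instantly.
// It yields once (after scheduling its own wake-up), then completes.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            // Ask to be polled again immediately, then yield.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

// A `Waker` that does nothing, just so we can poll by hand.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(YieldNow { yielded: false });

    // First poll yields, second poll completes: each step is near-instant.
    assert_eq!(future.as_mut().poll(&mut cx), Poll::Pending);
    assert_eq!(future.as_mut().poll(&mut cx), Poll::Ready(()));
}
```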
Yes, there are buggy `Future`s out there. But well-formed `Future`s complete quickly.
Now look at this function signature:

```rust
fn my_function() { /* ... */ }
```
By looking at this function signature, can you tell how long it will take to run? Maybe it will complete instantly. Maybe it reads from a file and can potentially take between a few microseconds to a few whole seconds, depending on the file system. Maybe it blocks on a network socket. Maybe it processes a bunch of data in a loop, meaning that for large datasets it could run for a long time.
Yes, you can check the docs. But the docs usually fail to mention any of the above behavior, even for functions in the standard library. All of this ignores behavior dependent on generics and traits, too. No matter how well-formed the function is, you can’t tell how it will act just by looking at it.
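To make this concrete, here’s a contrived pair of functions (the names and the 200 ms delay are mine, purely for illustration). Their signatures are identical, and nothing in the types hints that one of them blocks:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Pure computation: returns essentially instantly.
fn quick_function() {}

// Identical signature, but this one blocks the calling thread.
fn slow_function() {
    thread::sleep(Duration::from_millis(200));
}

fn main() {
    for (name, f) in [
        ("quick_function", quick_function as fn()),
        ("slow_function", slow_function as fn()),
    ] {
        let start = Instant::now();
        f();
        println!("{name} took {:?}", start.elapsed());
    }
}
```

From the caller’s side, the only way to discover the difference is to run them and measure.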
Often, when writing `async` programs, I have to be extra careful when using blocking functions that I’m not accidentally blocking, which would lock up the entire event loop. In most cases this requires me to read the entire code of the function to understand what can go wrong.
If I can’t be sure it won’t block, I’ll need to wrap it in a `Future` that runs it on its own isolated thread. `smol` provides the `blocking` threadpool to run code on other threads, while `tokio` has a `spawn_blocking` function.
```rust
use blocking::unblock;

fn my_blocking_function() { /* ... */ }

async fn my_async_main() {
    unblock(|| my_blocking_function()).await;
}
```
This method comes at a cost. At the very least it’s an allocation for the blocking task’s state, as well as a few atomic operations to push it onto and then pop it from some thread pool’s task queue. At worst it spawns an entire new thread. Compare this to the cost of `block_on`, which is usually one thread-local access.
But wait! `unblock` will send the function to another thread to be run, so the function needs to be `Send` and `'static`. This strategy doesn’t even work if the function relies on some kind of thread-unsafe state, like a `RefCell`. If the function takes a reference to some data, you may need to wrap it in an `Arc<Mutex>`.
```rust
use blocking::unblock;
use std::sync::{Arc, Mutex};

fn my_blocking_function(data: &mut Foo) { /* ... */ }

async fn my_async_main() {
    let data = Arc::new(Mutex::new(/* ... */));
    unblock({
        let data = data.clone();
        move || my_blocking_function(&mut data.lock().unwrap())
    }).await;
}
```
I know this is a common complaint about `tokio`’s style of `async`/`await`, but it’s just as bad the other way around.
For the record, you can call `block_on` with any kind of borrowed data, with no hassle.
```rust
async fn my_async_code(foo: &mut Foo) { /* ... */ }

fn my_main_blocking_code() {
    use futures_lite::future::block_on;

    let mut data = /* ... */;
    block_on(my_async_code(&mut data)); // This works!
}
```
In order to avoid these issues, I often have to segment out code that might block into its own sections. As a bonus, this lets me avoid the overhead of `unblock` for each individual function.
```rust
fn some_blocking_segment(mut data: Foo) {
    do_something(&mut data);
    data.postprocess();
    print_the_data(&data);
}

async fn my_async_main() {
    // This doesn't work if `Foo` is `!Send`.
    let data = /* ... */;
    unblock(move || some_blocking_segment(data)).await;
}
```
However, this requires me to re-architect parts of my code into these segments. It also gets difficult to interweave further `async` code into such a sub-section. Yes, I can call `async` functions from `block_on`, but I’d really prefer to `.await` them.
Say, doesn’t this seem very… leaky, to you?
Let’s Fix This
I don’t like to bring up a problem without also mentioning a possible solution. I mentioned documentation above; it would be nice if there was some kind of indicator that a function blocked.
```rust
/// Does a thing.
///
/// # Blocking
///
/// This function will block the first time it is called, as it is reading from
/// `/dev/random` to seed the random number generator.
fn my_blocking_function(data: &mut Foo) { /* ... */ }
```
It would be a Herculean effort, and I don’t think it’s a sustainable approach. If you’re writing a higher level library, it would be a lot to ask to check if your dependency’s dependency’s dependency maybe reads from a socket.
From a language standpoint, it would be nice if there were some kind of `#[blocking]` attribute to indicate that a function blocks, like so:

```rust
#[blocking]
fn my_blocking_function(data: &mut Foo) { /* ... */ }
```
Maybe there could even be some kind of tree-traversal to see if you were calling a `#[blocking]` function from `async` code, and raise a warning if so. Unfortunately, I’m unsure if this would work either. There are functions that might block once and never again, or functions that only block under specific circumstances that the Rust compiler can’t predict. Not to mention, it would be difficult to handle the case of data being processed in a tight loop.
So, I don’t know. There are some clever people on the language design team, so maybe they have better ideas.
Parting Shots
Frankly, I don’t think `async` code is leaky at all, and the ways that it does leak are largely due to library problems. Meanwhile, blocking code leaks by its fundamental design.

I hope you found this helpful, and that it removes some reservations about using `async` code in the future.