Blocking code is a leaky abstraction

John Nunley · October 19, 2024

Asynchronous code does not require the rest of your code to be asynchronous. I can’t say the same for blocking code.

Disclaimer: I am one of the maintainers for smol, a small and fast async runtime for Rust.

I’ve been involved in the Rust community for four years now, and in that time I’ve seen a lot of criticism of async. I’ve found it to be an elegant model for programming that easily outclasses the alternatives. I use it frequently in my own programs when it fits. There are a lot of programs that would be improved by async but don’t use it, because people are scared of it. In fact, many organizations have a “hard ban” on async code.

Some of this criticism is valid. async code is a little hard to wrap your head around, but that’s true of many other concepts in Rust, like the borrow checker and the weekly blood sacrifices. Many popular async libraries are explicitly tied to heavyweight crates like tokio and futures, which aren’t good picks for many types of programs. There are also a number of language features that still need to ship before async can be used without an annoying amount of Boxing of dynamic objects.

There’s one point, though, that I’ve heard quite frequently, and I think it’s misleading. Let’s talk about it.

What’s in a leak?

I’ve seen a lot of people say that async is a “leaky abstraction”. What this means is that the presence of async in a program forces you to bend the program’s control flow to accommodate it. If you have 100 files in your program and one of them uses async, you have to either write the entire program in async or resort to bohemian, mind-bending hacks to contain it. Just like an in-law moving into your spare bedroom.

I do not mean memory leaks, which is what happens if you fail to free memory that you allocate. Neither async nor blocking code has a problem with memory leaking intrinsically.

Dependency Dog: If you want to see a good example of a leaky abstraction, consider AppKit. Not only is AppKit thread-unsafe to the point where many functions can only safely be called on the main thread, it forces you into Apple’s terrifying Objective-C model. Basically any program that wants to have a working GUI on macOS needs to interface in Apple’s way, with basically no alternatives.

I’ve seen the “What Color is Your Function?” blogpost by Bob Nystrom referenced a lot in these discussions. This blogpost was originally written with JavaScript’s callbacks in mind. Fair enough. The callback model is hard to deal with, and its enduring popularity in the Rust ecosystem is something I’ll have to write a blogpost about someday. He also mentions async/await as a potential solution to this problem, although one that he is unsatisfied with, as it still divides the ecosystem into asynchronous and synchronous halves.

While this blogpost may be correct when it comes to JavaScript and other, higher-level languages, I believe that Rust stands out in such a way that it’s not true for this language. In fact, I believe the opposite is true. Non-async code (or “blocking” code) is the real leaky abstraction.

Object Class: Safe

I’d like to discuss how you call blocking code from async code, and vice versa. That way we can compare.

Let’s make a table to describe how it goes calling functions from one “color” to another. You can call blocking code from blocking code without any issues. You can also call asynchronous code from asynchronous code trivially. There is also a strategy for calling asynchronous code from blocking code that I will go into shortly. So our table looks like this:

| ↓ calls → | async | blocking |
| --- | --- | --- |
| async | Trivial | Generally Easy |
| blocking | We’ll see… | Trivial |

Note that not all code fits cleanly into the async/blocking categories. A notorious example is GUI code, which uses blocking semantics but overall acts a lot like async code in that it’s not allowed to block. But that’s a topic for another post.

When you write an async function, it returns a Future, which represents a value that will eventually be resolved. There are a lot of things you can do with a Future. You can race it against another Future, spawn it on an executor, and any number of other operations. It’s a point I delve deeper into in this post.

However, one of the simpler operations is to just wait for a Future to complete. Often, the waiting is done by blocking the current thread. So by “blocking on” the Future, we can effectively turn an async function into a synchronous call.

async fn my_async_code() { /* ... */ }

fn my_main_blocking_code() {
    use futures_lite::future::block_on;
    block_on(my_async_code());
}

block_on takes any Future, whether it’s !Send or not 'static or if it’s about to explode. So literally any async function can be called from synchronous code.

It’s relatively simple, too. block_on is implemented like this:

use std::future::Future;
use std::task::{Context, Poll};

pub fn block_on<T>(future: impl Future<Output = T>) -> T {
    // A `Context` with a `Waker` is needed to poll a `Future`.
    let waker = waker_that_blocks_current_thread();
    let mut context = Context::from_waker(&waker);

    // This used to require `unsafe` code, but doesn't anymore!
    let mut future = std::pin::pin!(future);

    // Poll the future in a loop, blocking the thread while we wait.
    loop {
        match future.as_mut().poll(&mut context) {
            Poll::Ready(value) => return value,
            Poll::Pending => block_thread_until_waker_wakes_us(),
        }
    }
}

Dependency Dog: The actual block_on is a little more complicated. It has some logic to reuse the waker between function calls, reducing the overhead to a single thread-local key access and nothing else.
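Here’s a complete, runnable sketch of that trick using only the standard library. The names are mine, not futures_lite’s (the real implementation uses the parking crate), but the shape is the same: a thread-parking waker cached in a thread-local so repeated calls reuse it.

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// A waker that unparks the thread it was created on.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

thread_local! {
    // Cache the waker so repeated `block_on` calls on the same thread
    // pay for one thread-local access instead of a fresh allocation.
    static WAKER: RefCell<Option<Waker>> = RefCell::new(None);
}

fn block_on<T>(future: impl Future<Output = T>) -> T {
    let waker = WAKER.with(|cache| {
        cache
            .borrow_mut()
            .get_or_insert_with(|| Waker::from(Arc::new(ThreadWaker(thread::current()))))
            .clone()
    });
    let mut context = Context::from_waker(&waker);
    let mut future = pin!(future);

    loop {
        match future.as_mut().poll(&mut context) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    assert_eq!(block_on(async { 40 + 2 }), 42);
    assert_eq!(block_on(async { "again" }), "again"); // the cached waker is reused here
}
```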

Okay, but what if you don’t want futures_lite in your dependency tree? futures_lite isn’t the heaviest dependency on the block (that’s futures), but it’s still a non-negligible amount of code. No need to worry! There’s also pollster, which has zero (required) dependencies and consists of less than 100 lines of code.

fn my_main_blocking_code() {
    use pollster::block_on;
    block_on(my_async_code());
}

So, calling async code from blocking code is easy. Just call block_on. It’s that simple!

It’s not that simple

Of course it’s not that simple. I’m sure people familiar with actually calling async code from blocking code are screaming at the screen right now. So let’s address that.

There are a substantial number of async crates out there that run on top of tokio. They use tokio’s primitives, tokio’s executor, and tokio’s I/O semantics. Because of this, they rely on tokio’s runtime to be running in the background. If you try the above strategy for a crate that relies on tokio, it will fail at runtime with a panic.

No need to fear. We can start a tokio runtime and let it peacefully run in the background, forever. The libraries are able to pick up on this runtime and use it.

In main(), during your program initialization, put this:

use std::{future, thread};

fn main() {
    // Create a runtime.
    let rt = tokio::runtime::Builder::new_current_thread()
        .enable_all()
        .build()
        .unwrap();

    // Clone a handle to the runtime and send it to another thread.
    thread::spawn({
        let handle = rt.handle().clone();

        // Run the handle on this thread, forever.
        move || handle.block_on(future::pending::<()>())
    });

    // "Enter" the runtime and let it sit there.
    let _guard = rt.enter();

    // Block on any futures.
    pollster::block_on(my_async_function());
}

For any block_on calls in your application, the runtime will already be available. Note that you will need to call enter() on any new threads that use tokio primitives. Thankfully you can get a Handle to the runtime, which can be sent to any thread and is also cheaply clonable.

But that’s really it. Once you have the runtime humming away in the background, tokio futures should Just Work!

As an aside, another hitch is that block_on and functions like it are only available on std-enabled platforms. But the no_std async story is a blogpost for another day.

A Quick Segue into tokio::main

I’ve seen some people recommend using the tokio::main attribute to turn an async function into a blocking function, then calling that from your real code. For example:

#[tokio::main(flavor = "current_thread")]
async fn my_async_code() { /* ... */ }

fn main() {
    // `tokio::main` transparently converts `my_async_code` into a blocking function.
    my_async_code();
}

It’s a little impressive, if a bit hacky. The async function is turned into a blocking function by the proc macro.

But… just don’t do this. It means that, every time my_async_code is called, it spins up a tokio runtime, runs the code, then immediately throws that runtime away. For functions that are called a lot, it really adds up. In addition it makes the function signature misleading. It’s a blocking function, not an async function!

Meanwhile, for blocking code…

First off, I find async code to be more predictable than blocking code, in a weird way. Look at this function signature:

async fn my_async_function() { /* ... */ }

What does this tell you? I know that the Future returned by this function won’t block. I can place it in my executor of choice, or race it against any other Futures, without worrying that it will hog the execution loop. By convention poll() will probably run in a time period close to “instant”, before yielding and then letting something else take over.

Yes, there are buggy Futures out there. But well-formed Futures complete quickly.

Now look at this function signature:

fn my_function() { /* ... */ }

By looking at this function signature, can you tell how long it will take to run? Maybe it will complete instantly. Maybe it reads from a file and can potentially take between a few microseconds to a few whole seconds, depending on the file system. Maybe it blocks on a network socket. Maybe it processes a bunch of data in a loop, meaning that for large datasets it could run for a long time.

Yes, you can check the docs. But the docs usually fail to mention any of the above behavior, even for functions in the standard library. All of this ignores behavior that depends on generics and traits, too. No matter how well-formed the function is, you can’t tell how it will act from the signature alone.

Often, when writing async programs, I have to be extra sure when I use blocking functions that I’m not accidentally blocking, which would lock up the entire event loop. In most cases this requires me to read the entire code of the function to understand what can go wrong.

If I can’t be sure it won’t block, I’ll need to wrap it in a Future that runs it on its own isolated thread. smol provides the blocking threadpool to run code on other threads, while tokio has a spawn_blocking function.

use blocking::unblock;

fn my_blocking_function() { /* ... */ }

async fn my_async_main() {
    unblock(|| my_blocking_function()).await;
}

This method comes at a cost. At the very least it’s an allocation for the blocking task’s state, as well as a few atomic operations to push it and then pop it from some thread pool’s task queue. At worst it spawns an entire new thread. Compare this to the cost of block_on which is usually one thread-local access.

But wait! unblock will send the function to another thread to be run. So, the function needs to be Send and 'static. This strategy doesn’t even work if the function relies on some kind of thread-unsafe state, like a RefCell. If the function takes a reference to some data you may need to wrap it in an Arc<Mutex>.

use blocking::unblock;
use std::sync::{Arc, Mutex};

fn my_blocking_function(data: &mut Foo) { /* ... */ }

async fn my_async_main() {
    let data = Arc::new(Mutex::new(/* ... */));
    unblock({
        let data = data.clone();
        move || my_blocking_function(&mut data.lock().unwrap())
    }).await;
}

I know this is a common complaint about tokio’s style of async/await, but it’s just as bad the other way around.

For the record, you can call block_on with any kind of borrowed data, with no hassles.

async fn my_async_code(foo: &mut Foo) { /* ... */ }

fn my_main_blocking_code() {
    use futures_lite::future::block_on;

    let mut data = /* ... */;
    block_on(my_async_code(&mut data)); // This works!
}

To avoid these issues, I often have to segment out code that might block into its own sections. As a bonus, this lets me avoid paying the overhead of unblock for each function.

fn some_blocking_segment(mut data: Foo) {
    do_something(&mut data);
    data.postprocess();
    print_the_data(&data);
}

async fn my_async_main() {
    // This doesn't work if `Foo` is `!Send`.
    let data = /* ... */;
    unblock(move || some_blocking_segment(data)).await;
}

However, this requires me to re-architect parts of my code into these segments. It also gets difficult to interweave further async code into such a sub-section. Yes, I can call async functions via block_on, but I’d really prefer to .await them.

Say, doesn’t this seem very… leaky, to you?

Let’s Fix This

I don’t like to bring up a problem without also mentioning a possible solution. I mentioned documentation above; it would be nice if there was some kind of indicator that a function blocked.

/// Does a thing.
/// 
/// # Blocking
/// 
/// This function will block the first time it is called, as it is reading from
/// `/dev/random` to seed the random number generator.
fn my_blocking_function(data: &mut Foo) { /* ... */ }

It would be a Herculean effort, and I don’t think it’s a sustainable approach. If you’re writing a higher level library, it would be a lot to ask to check if your dependency’s dependency’s dependency maybe reads from a socket.

From a language standpoint, it would be nice if there was some kind of #[blocking] attribute to indicate that a function blocked, like so:

#[blocking]
fn my_blocking_function(data: &mut Foo) { /* ... */ }

Maybe there could even be some kind of tree-traversal to see if you were calling a #[blocking] function from async code, and raise a warning if so. Unfortunately, I’m unsure if this would work either. There are functions that might block once and never again, or functions that only block under specific circumstances that the Rust compiler can’t predict. Not to mention, it would be difficult to solve the problem of data being processed in a tight loop.

So, I don’t know. There are some clever people on the language design team, so maybe they have better ideas.

Parting Shots

Frankly, I don’t think async code is leaky at all, and the ways that it does leak are largely due to library problems. Meanwhile blocking code leaks by its fundamental design. I hope you found this helpful and that it might remove some reservations about using async code in the future.

