<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="http://notgull.net/feed.xml" rel="self" type="application/atom+xml" /><link href="http://notgull.net/" rel="alternate" type="text/html" /><updated>2026-03-01T19:28:03+00:00</updated><id>http://notgull.net/feed.xml</id><title type="html">notgull</title><subtitle>The world&apos;s number one source of notgull</subtitle><author><name>John Nunley</name></author><entry><title type="html">About Bootstrapping, and why it’s important</title><link href="http://notgull.net/bootstrapping/" rel="alternate" type="text/html" title="About Bootstrapping, and why it’s important" /><published>2026-01-04T00:00:00+00:00</published><updated>2026-01-04T00:00:00+00:00</updated><id>http://notgull.net/bootstrapping</id><content type="html" xml:base="http://notgull.net/bootstrapping/"><![CDATA[<p>If you’re like me, you’ve probably asked “where does my compiler come from?”</p>

<p>The chain starts as follows: you have a program that’s written in some
programming language. If it’s a compiled language (like C++ or Rust), you’ll
need to compile it in order to run it.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">cat</span> <span class="o">&gt;</span> program.c <span class="o">&lt;&lt;</span><span class="no">EOF</span><span class="sh">
#include &lt;stdio.h&gt;

int main() {
  puts("Hello, world!");
  return 0;
}
</span><span class="no">EOF
</span><span class="nv">$ </span>cc program.c <span class="nt">-o</span> program
<span class="nv">$ </span>./program
Hello, world!
</code></pre></div></div>

<p>If it’s an interpreted language (like Python or JavaScript), there’s no need to
compile it. Just pass the program to the interpreter.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>python3 <span class="nt">-c</span> <span class="s2">"print('Hello, world!')"</span>
Hello, world!
</code></pre></div></div>

<p>The interpreter is usually a compiled program, however. So the actual chain of
events, amortized across multiple parties, can be simplified to:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="c"># At your distribution...</span>
<span class="nv">$ </span>cc python3.c <span class="nt">-o</span> python3
<span class="nv">$ </span>upload_to_server ./python3

<span class="nv">$ </span><span class="c"># At your computer...</span>
<span class="nv">$ </span>apt <span class="nb">install </span>python3
<span class="nv">$ </span>python3 <span class="nt">-c</span> <span class="s2">"print('Hello, world!')"</span>
Hello, world!
</code></pre></div></div>

<p>No matter what we do, at some point a compiler has to compile some code.</p>

<blockquote>
  <p>Of course, this is glossing over a lot that can happen between these two
extremes of “interpreted” and “compiled” languages. Some interpreters are
optimized by turning them into really fast compilers.</p>
</blockquote>

<p>However, the compiler is itself a program that’s written in some programming
language. Therefore, at the start, you need a compiler to compile that compiler.
But then <em>that</em> compiler itself needs a compiler. Then <em>that</em> compiler. So on
and so forth.</p>

<p>This is the <a href="https://en.wikipedia.org/wiki/Bootstrapping_(compilers)"><strong>bootstrap chain</strong></a>: the collection of software that is required to
build your software. Usually the bootstrap chain ends at a certain trusted root,
like the Debian software repositories. For most organizational contexts, this
is as far as you need to go.</p>

<p>But, any sufficiently good programmer will eventually get a little more
curious. They want to pull on the bootstrap chain a little further than that.
The majority of compilers are bootstrapped; they’re written in their own
language and can be used to compile themselves.</p>

<p>The short version of the story is that the previous version of a compiler is
used to compile the next version of a compiler. GCC 14 is used to compile
GCC 15. GCC 13 is used to compile GCC 14. Rinse and repeat, all the way back
to GCC 1 being used to compile GCC 2.</p>

<p>I can’t really find any history on how the first versions of GCC were compiled<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>,
but I imagine they were compiled using the simpler C compilers of the time. If
you were to follow those compilers back, you’d eventually get ones that were
implemented in assembly language. Meaning, you’d need an assembler to assemble
them.</p>

<p>Then, who assembles the assembler? Back in the 70’s, when this code was first
being written, it was common for programmers to just not use assemblers. They
would write out the intended assembly code, “assemble” it into the binary code
by hand, then load that binary code into the computer. <a href="https://web.archive.org/web/20170329155250/https://www.computer.org/web/awards/pioneer-david-wheeler">David Wheeler</a> gets
credit for inventing the first assembler, which would convert textual
assembly code into binary.</p>

<p>For most people, that’s where the story ends. Developers toggled bytes into
early computers to get assemblers, early assemblers built early compilers,
then early compilers built the modern compilers and interpreters we enjoy
today.</p>

<p>For us, it’s just the beginning.</p>

<h2 id="ken-thompson-and-trusting-trust">Ken Thompson and Trusting Trust</h2>

<p>You may notice that this entire extended “bootstrap chain” takes place over
almost 80 years of computing progress. Specifically, large parts of the chain
involve programs that have since been lost to time, or compilers that were
always closed source. It makes you wonder: what would happen if someone in
that chain was actively malicious? What if someone in the chain wanted to
control every program at every step after?</p>

<p><a href="https://en.wikipedia.org/wiki/Ken_Thompson">Ken Thompson</a> is a living legend.
He created Unix, and along the way also helped create the C programming language
alongside <a href="https://en.wikipedia.org/wiki/Dennis_Ritchie">Dennis Ritchie</a>, as
well as a million other little things like <code class="language-plaintext highlighter-rouge">ed</code> and UTF-8.</p>

<p>For his massive contribution to computer science, he won the
<a href="https://en.wikipedia.org/wiki/Turing_Award">Turing Award</a> in 1983. Upon
receiving the award, he <a href="https://www.cs.cmu.edu/~rdriley/487/papers/Thompson_1984_ReflectionsonTrustingTrust.pdf">demonstrated</a> a possible attack on compiler infrastructure
that could be used to compromise all software, everywhere.</p>

<p>He first demonstrated a fairly typical attack: a compiler that intentionally
mis-compiled a program with a known source input. Effectively, the compiler
changes the behavior of the program it’s compiling. For example, imagine we had
a program that validates some input as part of a login program.</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#include</span> <span class="cpf">&lt;stdio.h&gt;</span><span class="cp">
#include</span> <span class="cpf">&lt;string.h&gt;</span><span class="cp">
</span>
<span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">password</span> <span class="o">=</span> <span class="s">"foobar"</span><span class="p">;</span>

<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
  <span class="kt">char</span> <span class="n">buffer</span><span class="p">[</span><span class="mi">80</span><span class="p">];</span>

  <span class="n">printf</span><span class="p">(</span><span class="s">"Input password: "</span><span class="p">);</span>
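  <span class="n">fgets</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="mi">80</span><span class="p">,</span> <span class="n">stdin</span><span class="p">);</span>
  <span class="c1">// fgets keeps the trailing newline; strip it before comparing.</span>
  <span class="n">buffer</span><span class="p">[</span><span class="n">strcspn</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="s">"</span><span class="se">\n</span><span class="s">"</span><span class="p">)]</span> <span class="o">=</span> <span class="sc">'\0'</span><span class="p">;</span>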
  <span class="k">if</span> <span class="p">(</span><span class="n">strcmp</span><span class="p">(</span><span class="n">buffer</span><span class="p">,</span> <span class="n">password</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
    <span class="n">puts</span><span class="p">(</span><span class="s">"Password matches!"</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
  <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
    <span class="n">puts</span><span class="p">(</span><span class="s">"Incorrect password!"</span><span class="p">);</span>
    <span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
  <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, you go to compile this program using the system’s C compiler, <code class="language-plaintext highlighter-rouge">cc</code>. But,
let’s say the compiler is written like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">program_to_look_for</span> <span class="o">=</span> <span class="s">"int main() { ... }"</span><span class="p">;</span>
<span class="k">const</span> <span class="kt">char</span> <span class="o">*</span><span class="n">program_to_replace_it_with</span> <span class="o">=</span> <span class="s">"int main() { ... }"</span><span class="p">;</span>

<span class="kt">char</span> <span class="o">*</span><span class="n">program_source</span> <span class="o">=</span> <span class="n">read_c_code_from_file</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">strcmp</span><span class="p">(</span><span class="n">program_source</span><span class="p">,</span> <span class="n">program_to_look_for</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="c1">// Compile our custom program instead.</span>
  <span class="n">program_source</span> <span class="o">=</span> <span class="n">program_to_replace_it_with</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">do_the_actual_compilation</span><span class="p">(</span><span class="n">program_source</span><span class="p">);</span>
</code></pre></div></div>

<p>For most programs, the compiler will behave normally as intended. But, it’s
looking for your specific login program. If it sees it, it will replace the
source code it <em>should</em> compile with something else. So even if you pass the
compiler the correct source code, it can do whatever it wants with it.</p>

<p>Such a compiler could, for example, insert a backdoor into your login program.</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>good-cc login.c <span class="nt">-o</span> login
<span class="nv">$ </span>./login
Input password: secret_backdoor
Incorrect password!

<span class="nv">$ </span><span class="c"># Now, using our backdoored compiler:</span>
<span class="nv">$ </span>cc login.c <span class="nt">-o</span> login
<span class="nv">$ </span>./login
Input password: secret_backdoor
Password matches!
</code></pre></div></div>

<p>You can replace the “source code matching” code above with any other kind of
heuristic. For example, analyzing the compiler’s intermediate representation (RTL, in GCC’s case) for some kind of security subroutine,
then subtly replacing it with a backdoored version.</p>

<p>This is somewhat disturbing, but can be easily avoided by just using a trusted
compiler. Except, remember, compilers are themselves programs that need to be
compiled by compilers. It’s possible for a compiler to tell when it’s compiling
another compiler, then compile it with the backdoor code in it.</p>

<p>Imagine a backdoor programmed like this:</p>

<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">char</span> <span class="o">*</span><span class="n">program_source</span> <span class="o">=</span> <span class="n">read_program_code</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">strcmp</span><span class="p">(</span><span class="n">program_source</span><span class="p">,</span> <span class="n">c_compiler_source_code</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">program_source</span> <span class="o">=</span> <span class="n">backdoored_c_compiler_source_code</span><span class="p">;</span>
<span class="p">}</span> <span class="k">else</span> <span class="nf">if</span> <span class="p">(</span><span class="n">strcmp</span><span class="p">(</span><span class="n">program_source</span><span class="p">,</span> <span class="n">login_source_code</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span> <span class="p">{</span>
  <span class="n">program_source</span> <span class="o">=</span> <span class="n">backdoored_login_source_code</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">do_the_actual_compilation</span><span class="p">(</span><span class="n">program_source</span><span class="p">);</span>
</code></pre></div></div>

<p>Effectively, this is a self-reproducing backdoor; a “virus” of sorts that infects
every subsequent compiler. All a malicious actor has to do is insert this program
early in the bootstrap chain. Then, every subsequent compiler in the bootstrap
chain will have the backdoor as well. These compilers will compile the backdoor
into other compilers, which will then compile the backdoor into your program.</p>
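
<p>To make the propagation concrete, here’s a toy model in Rust. It’s a sketch, not Thompson’s code: the <code class="language-plaintext highlighter-rouge">Compiler</code> and <code class="language-plaintext highlighter-rouge">Output</code> types, the two source constants, and the <code class="language-plaintext highlighter-rouge">rebuild</code> helper are all made up for illustration, and a “binary” is reduced to a flag recording whether it is tainted. The thing to notice is that both source constants are clean, yet the taint still survives every rebuild.</p>

```rust
// Hypothetical stand-ins for real source files. Both are clean.
const COMPILER_SOURCE: &str = "/* honest compiler source */";
const LOGIN_SOURCE: &str = "/* honest login source */";

/// A compiled compiler. `infected` records whether the binary carries
/// the self-reproducing backdoor, regardless of what its source says.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Compiler {
    infected: bool,
}

/// What a compiler can produce in this toy model.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Output {
    Compiler(Compiler),
    Login { backdoored: bool },
    Other,
}

impl Compiler {
    /// "Compile" some source text. An infected compiler recognizes the
    /// compiler and login sources and taints what it emits for them.
    fn compile(self, source: &str) -> Output {
        match source {
            COMPILER_SOURCE => Output::Compiler(Compiler { infected: self.infected }),
            LOGIN_SOURCE => Output::Login { backdoored: self.infected },
            _ => Output::Other,
        }
    }
}

/// Rebuild the compiler from clean source for a number of generations,
/// each time using the previous generation's binary.
fn rebuild(start: Compiler, generations: usize) -> Compiler {
    let mut current = start;
    for _ in 0..generations {
        if let Output::Compiler(next) = current.compile(COMPILER_SOURCE) {
            current = next;
        }
    }
    current
}

fn main() {
    let honest = rebuild(Compiler { infected: false }, 10);
    let tainted = rebuild(Compiler { infected: true }, 10);

    println!("honest lineage:  {:?}", honest.compile(LOGIN_SOURCE));
    println!("tainted lineage: {:?}", tainted.compile(LOGIN_SOURCE));
}
```

<p>Ten rebuilds from clean source don’t help: the infected lineage still produces a backdoored login, while the honest lineage never does.</p>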

<h2 id="impact">Impact</h2>

<p>The terrifying thing about this virus is that there is no reliable way to detect
it. There’s no real way to inspect binaries to ensure the backdoor isn’t there.</p>

<p>You could, in theory, hexdump the programs and see if there are any errant
instructions in there. But the hexdumper could be backdoored as well, to silently
not show you the backdoor. Ken Thompson goes to great lengths to indicate that the
backdoor can affect any program that handles code; whether it’s the assembler,
the dynamic loader, or even the hardware’s microcode.</p>

<p>You could debug the program and check to see exactly what it’s executing. But,
the debugger could be backdoored. It could be silently skipping the instructions
that insert the backdoor.</p>

<p>You might be saying “no worries, we have <a href="https://en.wikipedia.org/wiki/Reproducible_builds">reproducible builds</a>.
We can just compare the program across builds to ensure that the hash is the same.”
Except, this doesn’t work. How are you checking that the program is the same?
<code class="language-plaintext highlighter-rouge">sha256sum</code>? GnuPG? These can be backdoored too.</p>
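
<p>The comparison itself is the easy part. Here’s a minimal sketch in Rust, assuming the two build outputs are already loaded into memory; the <code class="language-plaintext highlighter-rouge">first_difference</code> name and the sample bytes are made up for illustration. The catch is that this checker is itself a compiled program, so it inherits exactly the same trust problem.</p>

```rust
/// Return the offset of the first byte where two build outputs differ,
/// or `None` if they are byte-for-byte identical.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    a.iter()
        .zip(b)
        .position(|(x, y)| x != y)
        // Same shared prefix but different lengths: they diverge where
        // the shorter one ends.
        .or_else(|| (a.len() != b.len()).then(|| a.len().min(b.len())))
}

fn main() {
    // Stand-ins for two builds of the same source.
    let build_a: [u8; 5] = [0x7F, 0x45, 0x4C, 0x46, 0x01];
    let build_b: [u8; 5] = [0x7F, 0x45, 0x4C, 0x46, 0x02];

    match first_difference(&build_a, &build_b) {
        None => println!("builds are identical"),
        Some(offset) => println!("builds differ at byte {offset}"),
    }
}
```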

<p>You would have to manually inspect the hard drive with an electron microscope
to actually verify the exact contents. But even then, this isn’t a bulletproof
defense. Modern CPUs run microcode to translate the instruction code to
the underlying CPU microarchitecture. That microcode can be compromised and
made to mis-translate certain code.</p>

<p>This isn’t a theoretical attack, either. Viruses like this have already been
<a href="https://web.archive.org/web/20090821051953/http://www.h-online.com/security/Virus-infects-development-environment--/news/114031">discovered in the wild</a>.
Granted, in this case, this virus only infected Delphi environments and was
mostly harmless.</p>

<p>But now, imagine what a nation state could do with this attack. PRISM has
existed for some time now. In fact, there are individual software companies 
(and not that big of ones, either) that could put backdoors into popular open
source software that could be replicated throughout the ecosystem.</p>

<p>This affects all software, closed and open source.</p>

<h2 id="the-solution">The Solution</h2>

<p>The solution to the source of compiled software being in question is to
have a way to create software without relying on other software. Back in
the 70’s, the boot program for a computer was toggled into the front panel
by a series of levers. This program would be the bare minimum needed to
load the rest of the operating system from the disk.</p>

<p>This is still theoretically possible today. For example, it’s possible to
take a floppy disk and a pencil, then make holes in the floppy disk with
the pencil. The floppy disk reader can interpret the holes as binary,
which effectively lets you program with a pencil.</p>

<p>You could use this strategy to write a compiler for a very, very basic
programming language. Since you can trust the compiler you just wrote,
you can trust anything that compiler has output. Then, you use this
language to write a compiler for a slightly more advanced programming
language.</p>

<p>You keep going until you create an assembler, then keep going
after that until you have a C compiler. With a C compiler you can start
compiling old GNU utilities until you arrive at GCC. Once you have GCC,
you can compile anything, including Linux.</p>

<p>This is what the <a href="https://github.com/oriansj/stage0"><code class="language-plaintext highlighter-rouge">stage0</code></a> project has
been doing for some time. There is a hex interpreter that starts with
around 380 bytes of code. This is small enough to toggle into a computer,
then verify manually, by hand. Then after a couple of more advanced
hex interpreters, it brings up a macro assembler, which then assembles
a basic C compiler. This then compiles <a href="https://bellard.org/tcc/">TinyCC</a>,
which can compile a basic operating system and userland. This is enough
to get GCC up, which then brings up Linux.</p>
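
<p>To give a sense of how little that first stage has to do, here’s a sketch of a hex loader in Rust. This is not the <code class="language-plaintext highlighter-rouge">stage0</code> implementation, which is hand-written machine code; the <code class="language-plaintext highlighter-rouge">assemble_hex</code> name is made up, and the input syntax (pairs of hex digits, with <code class="language-plaintext highlighter-rouge">#</code> or <code class="language-plaintext highlighter-rouge">;</code> starting a comment that runs to the end of the line) is an assumption for illustration.</p>

```rust
/// Convert commented hex text into raw bytes.
///
/// Assumed syntax: pairs of hex digits, whitespace ignored, and `#` or `;`
/// starting a comment that runs to the end of the line.
fn assemble_hex(text: &str) -> Result<Vec<u8>, String> {
    let mut bytes = Vec::new();
    // The high nibble of a byte we have only half read so far.
    let mut high: Option<u8> = None;

    for line in text.lines() {
        // Keep only the part of the line before any comment marker.
        let code = line
            .split(|c: char| c == '#' || c == ';')
            .next()
            .unwrap_or("");

        for ch in code.chars().filter(|c| !c.is_whitespace()) {
            let digit = ch
                .to_digit(16)
                .ok_or_else(|| format!("not a hex digit: {ch:?}"))? as u8;

            match high.take() {
                // First digit of a pair: remember it.
                None => high = Some(digit),
                // Second digit: combine both nibbles into one byte.
                Some(h) => bytes.push(h * 16 + digit),
            }
        }
    }

    if high.is_some() {
        return Err("odd number of hex digits".to_string());
    }
    Ok(bytes)
}

fn main() {
    let source = "7F 45 4C 46  # ELF magic number\n90 ; a single x86 NOP\n";
    let bytes = assemble_hex(source).expect("valid hex input");
    println!("{bytes:02X?}");
}
```

<p>The whole thing is one loop and one piece of state, which is what makes it plausible to audit the real version byte by byte.</p>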

<p>This is glossing over a lot. The full bootstrap chain is <a href="https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst">here</a>.
I also talk about bootstrappable builds a lot in <a href="https://notgull.net/announcing-dozer/">this post</a>.</p>

<p>Of course, this isn’t bulletproof. <code class="language-plaintext highlighter-rouge">stage0</code> requires a BIOS or UEFI implementation
to run. BIOS and UEFI are written in code; code that can be very easily compromised.
If we’re going this far, who’s to say the microcode or boot ROM on the CPU isn’t
compromised? In fact, how do we know the RTL on the CPU isn’t backdoored? It’s not
like anyone has ever bothered to check.</p>

<h2 id="conclusion">Conclusion</h2>

<p>I wanted to share a little of my computing paranoia in this post. Computers
can never be 100 percent secure. Until someone manages to build a CPU in their
garage out of nothing but electrical wire and duct tape, then bring up Linux
on that CPU, it’s possible for any compiler to be compromised.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>This <a href="https://old.reddit.com/r/AskProgramming/comments/1k0gs11/comment/mndwbzc">reddit post</a> seems to indicate that the first version of GCC was compiled with some early commercial compiler. But it’s unsourced, and it’s not like I’d take the word of a Reddit post anyways. If you happen to have more context on the early history of GCC, please reach out to me. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>John Nunley</name></author><category term="rust" /><category term="bootstrapping" /><summary type="html"><![CDATA[If you’re like me, you’ve probably asked “where does my compiler come from?”]]></summary></entry><entry><title type="html">Rust’s Block Pattern</title><link href="http://notgull.net/block-pattern/" rel="alternate" type="text/html" title="Rust’s Block Pattern" /><published>2025-12-18T00:00:00+00:00</published><updated>2025-12-18T00:00:00+00:00</updated><id>http://notgull.net/block-pattern</id><content type="html" xml:base="http://notgull.net/block-pattern/"><![CDATA[<p>Here’s a little idiom that I haven’t really seen discussed anywhere, that I think makes Rust code much cleaner and more robust.</p>

<p>I don’t know if there’s an actual name for this idiom; I’m calling it the “block pattern” for lack of a better word. I find
myself reaching for it frequently in code, and I think other Rust code could become cleaner if it followed this pattern.
If there’s an existing name for this, please let me know!</p>

<p>The pattern comes from blocks in Rust being valid expressions. For example, this code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="p">{</span> <span class="mi">1</span> <span class="o">+</span> <span class="mi">2</span> <span class="p">};</span>
</code></pre></div></div>

<p>…is equivalent to this code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="mi">1</span> <span class="o">+</span> <span class="mi">2</span><span class="p">;</span>
</code></pre></div></div>

<p>…which is, in turn, equivalent to this code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
    <span class="k">let</span> <span class="n">y</span> <span class="o">=</span> <span class="mi">2</span><span class="p">;</span>
    <span class="n">x</span> <span class="o">+</span> <span class="n">y</span>
<span class="p">};</span>
</code></pre></div></div>

<h2 id="so-why-does-this-matter">So, why does this matter?</h2>

<p>Let’s say you have a function that loads a configuration file, then sends a few HTTP requests based on that config file.
In order to load that config file, first you need to load the raw bytes of that file from the disk. Then you need to parse
whatever the format of the configuration file is. For the sake of having a complex enough program to demonstrate the value
of this pattern, let’s say it’s JSON with comments. You would need to remove the comments first using the <a href="https://crates.io/crates/regex"><code class="language-plaintext highlighter-rouge">regex</code></a> crate,
then parse the resulting JSON with something like <a href="https://crates.io/crates/serde-json"><code class="language-plaintext highlighter-rouge">serde_json</code></a>.</p>

<p>Such a function would look like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">regex</span><span class="p">::{</span><span class="n">Regex</span><span class="p">,</span> <span class="n">RegexBuilder</span><span class="p">};</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::{</span><span class="n">fs</span><span class="p">,</span> <span class="nn">sync</span><span class="p">::</span><span class="n">LazyLock</span><span class="p">};</span>

<span class="cd">/// Format of the configuration file.</span>
<span class="nd">#[derive(serde::Deserialize)]</span>
<span class="k">struct</span> <span class="n">Config</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="c1">// Always make sure to cache your regexes!</span>
<span class="k">static</span> <span class="n">STRIP_COMMENTS</span><span class="p">:</span> <span class="n">LazyLock</span><span class="o">&lt;</span><span class="n">Regex</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">LazyLock</span><span class="p">::</span><span class="nf">new</span><span class="p">(||</span> <span class="p">{</span>
    <span class="nn">RegexBuilder</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">r"//.*"</span><span class="p">)</span><span class="nf">.multi_line</span><span class="p">(</span><span class="k">true</span><span class="p">)</span><span class="nf">.build</span><span class="p">()</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"regex build failed"</span><span class="p">)</span>
<span class="p">});</span>

<span class="cd">/// Function to load the config and send some HTTP requests.</span>
<span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">cfg_file</span><span class="p">:</span> <span class="o">&amp;</span><span class="nb">str</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">anyhow</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="c1">// Load the raw bytes of the file.</span>
    <span class="k">let</span> <span class="n">config_data</span> <span class="o">=</span> <span class="nn">fs</span><span class="p">::</span><span class="nf">read</span><span class="p">(</span><span class="n">cfg_file</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="c1">// Convert to a string so the regex can work on it.</span>
    <span class="k">let</span> <span class="n">config_string</span> <span class="o">=</span> <span class="nn">String</span><span class="p">::</span><span class="nf">from_utf8</span><span class="p">(</span><span class="n">config_data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="c1">// Strip out all comments.</span>
    <span class="k">let</span> <span class="n">stripped_data</span> <span class="o">=</span> <span class="n">STRIP_COMMENTS</span><span class="nf">.replace_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config_string</span><span class="p">,</span> <span class="s">""</span><span class="p">);</span>

    <span class="c1">// Parse as JSON.</span>
    <span class="k">let</span> <span class="n">config</span><span class="p">:</span> <span class="n">Config</span> <span class="o">=</span> <span class="nn">serde_json</span><span class="p">::</span><span class="nf">from_str</span><span class="p">(</span><span class="o">&amp;</span><span class="n">stripped_data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="c1">// Do some work based on this data.</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url1</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url2</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url3</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="nf">Ok</span><span class="p">(())</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This is fairly simple, and just leverages a few Rust crates and language features to parse JSON and then do something with it.</p>

<p>However, there are a few weaknesses here. In the <code class="language-plaintext highlighter-rouge">foo</code> function, we declare four new variables (<code class="language-plaintext highlighter-rouge">config_data</code>, <code class="language-plaintext highlighter-rouge">config_string</code>,
<code class="language-plaintext highlighter-rouge">stripped_data</code>, <code class="language-plaintext highlighter-rouge">config</code>) only for one of those variables to be used after the configuration parsing (<code class="language-plaintext highlighter-rouge">config</code>). In addition,
let’s say you didn’t know what this code was for going in, and you didn’t have these comments (or you had bad comments). One might
ask why you’re declaring the regular expression <code class="language-plaintext highlighter-rouge">STRIP_COMMENTS</code>, or why you’re loading data from a file.</p>

<p>When I write code, I try to make it immediately obvious what the purpose of the code is, and why it’s written that way. This is why
I generally avoid C’s “bottom-up” strategy for organizing code. It’s like being given a few screws and being expected to implicitly
understand that it should be built into a chair. In Rust, I like that you are able to define your top-level functions first, and then
go down and define all the bits and pieces after.</p>

<p>Although, we can do a little bit better. What if we organized the <code class="language-plaintext highlighter-rouge">foo</code> function like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cd">/// Function to load the config and send some HTTP requests.</span>
<span class="k">fn</span> <span class="nf">foo</span><span class="p">(</span><span class="n">cfg_file</span><span class="p">:</span> <span class="o">&amp;</span><span class="nb">str</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">anyhow</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="c1">// Load the configuration from the file.</span>
    <span class="k">let</span> <span class="n">config</span> <span class="o">=</span> <span class="p">{</span>
        <span class="c1">// Cached regular expression for stripping comments.</span>
        <span class="k">static</span> <span class="n">STRIP_COMMENTS</span><span class="p">:</span> <span class="n">LazyLock</span><span class="o">&lt;</span><span class="n">Regex</span><span class="o">&gt;</span> <span class="o">=</span> <span class="nn">LazyLock</span><span class="p">::</span><span class="nf">new</span><span class="p">(||</span> <span class="p">{</span>
            <span class="nn">RegexBuilder</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="s">r"//.*"</span><span class="p">)</span><span class="nf">.multi_line</span><span class="p">(</span><span class="k">true</span><span class="p">)</span><span class="nf">.build</span><span class="p">()</span><span class="nf">.expect</span><span class="p">(</span><span class="s">"regex build failed"</span><span class="p">)</span>
        <span class="p">});</span>

        <span class="c1">// Load the raw bytes of the file.</span>
        <span class="k">let</span> <span class="n">raw_data</span> <span class="o">=</span> <span class="nn">fs</span><span class="p">::</span><span class="nf">read</span><span class="p">(</span><span class="n">cfg_file</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

        <span class="c1">// Convert to a string so the regex can work on it.</span>
        <span class="k">let</span> <span class="n">data_string</span> <span class="o">=</span> <span class="nn">String</span><span class="p">::</span><span class="nf">from_utf8</span><span class="p">(</span><span class="n">raw_data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

        <span class="c1">// Strip out all comments.</span>
        <span class="k">let</span> <span class="n">stripped_data</span> <span class="o">=</span> <span class="n">STRIP_COMMENTS</span><span class="nf">.replace_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">data_string</span><span class="p">,</span> <span class="s">""</span><span class="p">);</span>

        <span class="c1">// Parse as JSON.</span>
        <span class="nn">serde_json</span><span class="p">::</span><span class="nn">from_str</span><span class="p">::</span><span class="o">&lt;</span><span class="n">Config</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="n">stripped_data</span><span class="p">)</span><span class="o">?</span>
    <span class="p">};</span>

    <span class="c1">// Do some work based on this data.</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url1</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url2</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="nf">send_http_request</span><span class="p">(</span><span class="o">&amp;</span><span class="n">config</span><span class="py">.url3</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="nf">Ok</span><span class="p">(())</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In this function, we’ve moved all of the configuration-related code (parsing, loading, even the static regex) into the block.
This works because Rust lets you have items, statements and expressions inside of a block, hence why we were able to move everything
inside. This pattern has three immediate advantages:</p>

<ul>
  <li>The block starts with the intent of the code (<code class="language-plaintext highlighter-rouge">let config = ...</code>). We can see that we’re working to resolve some kind of
configuration object right off the bat. Only then do we move into the implementation details of the code.</li>
  <li>It reduces pollution of the namespace of both the <code class="language-plaintext highlighter-rouge">foo</code> function and the top-level module. Now in <code class="language-plaintext highlighter-rouge">foo</code>, the variable names
<code class="language-plaintext highlighter-rouge">raw_data</code>, <code class="language-plaintext highlighter-rouge">data_string</code> et al. are no longer visible. In addition to allowing these variable names to be re-used, it
makes this code a lot more “idiot-proof”. If someone else were to edit the <code class="language-plaintext highlighter-rouge">foo</code> function, they would only be able to use
<code class="language-plaintext highlighter-rouge">config</code>. They wouldn’t be able to use the <code class="language-plaintext highlighter-rouge">raw_data</code> or <code class="language-plaintext highlighter-rouge">STRIP_COMMENTS</code> items, which are only meant to be used by the
<code class="language-plaintext highlighter-rouge">config</code> parser.</li>
  <li>The variables <code class="language-plaintext highlighter-rouge">raw_data</code> and <code class="language-plaintext highlighter-rouge">data_string</code> go out of scope at the end of the block, which means they are dropped, freeing
up resources.</li>
</ul>
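<p>To make the last point concrete, here is a minimal sketch of when block-local values get dropped. The <code class="language-plaintext highlighter-rouge">Buffer</code> type and the <code class="language-plaintext highlighter-rouge">block_drop_order</code> function are hypothetical names I made up for illustration; <code class="language-plaintext highlighter-rouge">Buffer</code> just records its own destruction so the order is observable.</p>

```rust
use std::cell::RefCell;

/// Hypothetical stand-in for a large buffer; it logs when it is dropped.
struct Buffer<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl Drop for Buffer<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

/// Runs the block pattern and returns the order in which things happened.
fn block_drop_order() -> Vec<String> {
    let log = RefCell::new(Vec::new());

    let length = {
        let raw_data = Buffer { name: "raw_data", log: &log };
        let data_string = Buffer { name: "data_string", log: &log };

        // Only this plain integer escapes the block.
        raw_data.name.len() + data_string.name.len()
    }; // Both buffers are dropped here, in reverse declaration order.

    log.borrow_mut().push(format!("after block, length = {length}"));
    log.into_inner()
}
```

<p>Both buffers are gone before the line after the block runs, so whatever they were holding onto is released before the rest of the function does its work.</p>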

<p>As an aside, you would also get all three of these advantages by refactoring the block out into its own function. However, this
pattern has two key advantages over that:</p>

<ul>
  <li>The code flow is still inline with the rest of the function. For shorter blocks, this improves reading comprehension, since it
means you don’t have to go to a different part of the code to fully understand the function.</li>
  <li>If there are a lot of variables that the block would use, it prevents needing to explicitly name those variables as parameters.</li>
</ul>

<p>There is one more benefit that’s not exposed in the above example: erasure of mutability. Let’s say you construct some object for
use in a later part of the function:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
<span class="n">data</span><span class="nf">.push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">data</span><span class="nf">.extend_from_slice</span><span class="p">(</span><span class="o">&amp;</span><span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">]);</span>

<span class="n">data</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.for_each</span><span class="p">(|</span><span class="n">x</span><span class="p">|</span> <span class="nd">println!</span><span class="p">(</span><span class="s">"{x}"</span><span class="p">));</span>
<span class="k">return</span> <span class="n">data</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
</code></pre></div></div>

<p>The issue is that <code class="language-plaintext highlighter-rouge">data</code> is declared as mutable, which means the rest of the function can mutate it. Since a lot of bugs come from
data being mutated when it isn’t supposed to be mutated, we’d like to restrict the mutability of the data to a certain area of the
function. This is also possible with the block pattern:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">data</span> <span class="o">=</span> <span class="p">{</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
    <span class="n">data</span><span class="nf">.push</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
    <span class="n">data</span><span class="nf">.extend_from_slice</span><span class="p">(</span><span class="o">&amp;</span><span class="p">[</span><span class="mi">4</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="mi">6</span><span class="p">,</span> <span class="mi">7</span><span class="p">]);</span>
    <span class="n">data</span>
<span class="p">};</span>

<span class="n">data</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.for_each</span><span class="p">(|</span><span class="n">x</span><span class="p">|</span> <span class="nd">println!</span><span class="p">(</span><span class="s">"{x}"</span><span class="p">));</span>
<span class="k">return</span> <span class="n">data</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
</code></pre></div></div>

<p>This effectively “closes” the mutability to a certain section of the function.</p>
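<p>For completeness, here is the same snippet wrapped in a function so it can be compiled on its own. The function name <code class="language-plaintext highlighter-rouge">third_element</code> is my own invention; the body is the block from above.</p>

```rust
/// Builds the vector inside a block, then only exposes an immutable binding.
fn third_element() -> i32 {
    let data = {
        let mut data = vec![];
        data.push(1);
        data.extend_from_slice(&[4, 5, 6, 7]);
        data
    };

    // Uncommenting the next line is a compile error, because the outer
    // `data` binding is not declared `mut`.
    // data.push(8);

    data[2]
}
```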

<h2 id="closing-thoughts">Closing Thoughts</h2>

<p>I don’t know if this pattern is already well known to the Rust community. Even if it is, I figure it’s still a good idea to bring it
to people who may be inexperienced in Rust.</p>]]></content><author><name>John Nunley</name></author><category term="rust" /><summary type="html"><![CDATA[Here’s a little idiom that I haven’t really seen discussed anywhere, that I think makes Rust code much cleaner and more robust.]]></summary></entry><entry><title type="html">How to deal with Rust dependencies</title><link href="http://notgull.net/rust-dependencies/" rel="alternate" type="text/html" title="How to deal with Rust dependencies" /><published>2025-06-01T00:00:00+00:00</published><updated>2025-06-01T00:00:00+00:00</updated><id>http://notgull.net/rust-dependencies</id><content type="html" xml:base="http://notgull.net/rust-dependencies/"><![CDATA[<p>I hate to be the first one to tell you this, but Rust projects tend to have a
lot of dependencies.</p>

<p><em>Disclaimer:</em> I will take a look at some publicly available Rust crates today.
Please do not harass their authors, maintainers or users.</p>

<p>I’m not kidding. Let’s check out <a href="https://github.com/BurntSushi/ripgrep"><code class="language-plaintext highlighter-rouge">ripgrep</code></a>, one of the most popular Rust
programs of all time. We can check the number of dependencies fairly easily by
cloning the project, then running <code class="language-plaintext highlighter-rouge">cargo tree</code> through a nightmarish <code class="language-plaintext highlighter-rouge">sed</code>
invocation.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/BurntSushi/ripgrep
<span class="nv">$ </span><span class="nb">cd </span>ripgrep
<span class="nv">$ </span>cargo tree <span class="nt">-e</span> no-dev <span class="nt">--prefix</span> none | <span class="se">\</span>
    <span class="nb">sed</span> <span class="nt">-e</span> <span class="s1">'s/(\*)//g'</span> <span class="nt">-e</span> <span class="s1">'s/^[ \t]*//;s/[ \t]*$//'</span> | <span class="se">\</span>
    <span class="nb">sort</span> <span class="nt">-u</span> | <span class="nb">wc</span> <span class="nt">-l</span>
33
</code></pre></div></div>

<p>To break down each step of the shell command:</p>

<ul>
  <li><code class="language-plaintext highlighter-rouge">cargo tree</code> prints the dependencies of the current Cargo package in a “tree”
format, that lets you see which dependencies are brought in by which other
dependencies. <code class="language-plaintext highlighter-rouge">-e no-dev</code> skips dev dependencies, which are only used for
testing. By default <code class="language-plaintext highlighter-rouge">cargo</code> uses a tree formatting that we can skip using the
option <code class="language-plaintext highlighter-rouge">--prefix none</code>.</li>
  <li><code class="language-plaintext highlighter-rouge">sed</code> is a simple core utility that lets you run regular expressions on each
line of a command’s output. Each expression is denoted by <code class="language-plaintext highlighter-rouge">-e</code>.
    <ul>
      <li>The first expression, <code class="language-plaintext highlighter-rouge">s/(\*)//g</code>, is a simple replacement that says “replace
(<code class="language-plaintext highlighter-rouge">s</code>) every instance of <code class="language-plaintext highlighter-rouge">(*)</code> with nothing, <code class="language-plaintext highlighter-rouge">g</code>lobally”. <code class="language-plaintext highlighter-rouge">cargo tree</code> adds
this to repeated dependencies and I’d like to remove that.</li>
      <li>The second expression <code class="language-plaintext highlighter-rouge">s/^[ \t]*//;s/[ \t]*$//</code> removes whitespace at the
start and end of each line. This just cleans up the output in such a way
that it doesn’t mess with our uniqueness test later.</li>
    </ul>
  </li>
  <li><code class="language-plaintext highlighter-rouge">sort -u</code> sorts every line and deletes duplicated lines.</li>
  <li><code class="language-plaintext highlighter-rouge">wc -l</code> prints the total number of lines.</li>
</ul>
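<p>If the <code class="language-plaintext highlighter-rouge">sed</code> invocation is hard to follow, the same counting logic can be sketched in Rust. This is an illustrative equivalent, not what I actually ran; <code class="language-plaintext highlighter-rouge">count_unique_deps</code> is a hypothetical helper that expects the output of <code class="language-plaintext highlighter-rouge">cargo tree --prefix none</code> as a string.</p>

```rust
use std::collections::BTreeSet;

/// Counts the unique lines in `cargo tree --prefix none` output, mirroring
/// the `sed | sort -u | wc -l` pipeline.
fn count_unique_deps(tree_output: &str) -> usize {
    tree_output
        .lines()
        // Equivalent of `s/(\*)//g`: drop the repeated-dependency marker.
        .map(|line| line.replace("(*)", ""))
        // Equivalent of `s/^[ \t]*//;s/[ \t]*$//`: trim spaces and tabs.
        .map(|line| line.trim_matches(|c| c == ' ' || c == '\t').to_owned())
        // Equivalent of `sort -u`: a sorted set keeps one copy of each line.
        .collect::<BTreeSet<String>>()
        // Equivalent of `wc -l`.
        .len()
}
```

<p>The crate names used to exercise this would be made-up sample data, but the important case is a line such as <code class="language-plaintext highlighter-rouge">left v1.0.0 (*)</code>, which collapses into the same entry as <code class="language-plaintext highlighter-rouge">left v1.0.0</code>.</p>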

<p>Thirty-three dependencies isn’t <em>that</em> bad for a highly advanced text matching
program. Most of the dependencies are things like the Rust <a href="https://crates.io/crates/regex"><code class="language-plaintext highlighter-rouge">regex</code></a> engine, or
<a href="https://crates.io/crates/aho-corasick">highly optimized text matching libraries</a>.
Frankly, I can find a fairly good reason for these dependencies. Then again, it’s
a small command line utility, so it’s not like I’d <em>expect</em> it to have a million
dependencies. Great job!</p>

<p>So let’s try something a little more complicated. Specifically, let’s try a 
networking application, since those are known for having unneeded dependencies.
I’ll pick one off the <a href="https://github.com/rust-unofficial/awesome-rust">awesome list</a>…
ah, <a href="https://github.com/svenstaro/miniserve"><code class="language-plaintext highlighter-rouge">miniserve</code></a>.</p>

<p>It should be a relatively small application, right? So let’s check the dependency
count.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>git clone https://github.com/svenstaro/miniserve
<span class="nv">$ </span><span class="nb">cd </span>miniserve
<span class="nv">$ </span>cargo tree <span class="nt">--no-default-features</span> <span class="nt">-e</span> no-dev <span class="nt">--prefix</span> none | <span class="se">\</span>
    <span class="nb">sed</span> <span class="nt">-e</span> <span class="s1">'s/(\*)//g'</span> <span class="nt">-e</span> <span class="s1">'s/^[ \t]*//;s/[ \t]*$//'</span> | <span class="se">\</span>
    <span class="nb">sort</span> <span class="nt">-u</span> | <span class="nb">wc</span> <span class="nt">-l</span>
281
</code></pre></div></div>

<p>I specifically added <code class="language-plaintext highlighter-rouge">--no-default-features</code> to remove unneeded dependencies,
and we still have a grand total of two hundred and eighty-one dependencies.
That’s quite a few! Looking into the dependency list, we can see:</p>

<ul>
  <li>A <a href="https://crates.io/crates/fast_qr">QR code generator</a>.</li>
  <li>The <a href="https://crates.io/crates/rand"><code class="language-plaintext highlighter-rouge">rand</code></a> crate, both version v0.8.5 and
version v0.9.1, as well as <a href="https://crates.io/crates/fastrand"><code class="language-plaintext highlighter-rouge">fastrand</code></a>.</li>
  <li><a href="https://crates.io/crates/actix-web"><code class="language-plaintext highlighter-rouge">actix-web</code></a>, which brings in all of
<a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> and various crates for handling
internet edge cases.</li>
  <li>Further duplicates of <a href="https://crates.io/crates/base64"><code class="language-plaintext highlighter-rouge">base64</code></a>,
<a href="https://crates.io/crates/hashbrown"><code class="language-plaintext highlighter-rouge">hashbrown</code></a>,
<a href="https://crates.io/crates/syn"><code class="language-plaintext highlighter-rouge">syn</code></a> and
<a href="https://crates.io/crates/zerocopy"><code class="language-plaintext highlighter-rouge">zerocopy</code></a>.</li>
</ul>

<p>I can justify a lot of these dependencies. The internet is a complicated place,
and anything that needs to face the public ‘net and respond to 99% of web clients
needs to deal with that. Still, that’s 281 pieces of code that the maintainers have
to audit. Imagine if one of those dependencies is compromised, or just becomes
unmaintained. Not to mention the fact that several of these crates are compiled
in twice, at two different versions.</p>

<p>I’m not here to gawk at dependency graphs. Anyone
can do that. I’d like to identify the scope of the problem, and see if we can
figure out some solutions.</p>

<h2 id="dependency-dragback">Dependency Dragback</h2>

<p>To be clear, this problem isn’t just something randoms complain about. I’ve
been on both sides of this issue. I’ve been in security audits where “the
amount of code that gets pulled in” is an active demerit against integrating
Rust into a project. I’ve also been the author of libraries where lowering the
dependency count becomes a big issue.</p>

<p>I’d also like to be clear that this isn’t a problem unique to Rust. People have
been complaining about extraneous packages on JavaScript and Python ever since
they got package managers. JavaScript is famous for its “leftpad” incident, as
well as simple pieces of code requiring two hundred dependencies. Even C++ runs
into dependency problems once you add Boost to the equation.</p>

<p>In all of these languages, there are two types of dependencies, in my opinion.</p>

<ul>
  <li>Dependencies that do something you don’t want to do yourself. You don’t want
to implement an HTTP server, so you pull in <a href="https://crates.io/crates/axum"><code class="language-plaintext highlighter-rouge">axum</code></a>.</li>
  <li>Dependencies that act as the canonical interface to some system facility or
hardware device. Think the C library for making system calls<sup id="fnref:syscall" role="doc-noteref"><a href="#fn:syscall" class="footnote" rel="footnote">1</a></sup>, or the
OpenGL library for putting things on the screen with the GPU.</li>
</ul>

<p>To be clear, there are very good reasons to pull in the first kind of dependency.
For things like cryptography or networking, writing low-level operations yourself
is a one-way trip to security-breach-ville. Even for less critical operations,
it’s usually better to use a tried-and-true, tested library than going and writing
it yourself.</p>

<p>There’s also something to be said here about code reuse; there’s no reason why
there should be more than one implementation of some algorithm in the Rust
community, necessitating the splitting of work between experts. Why have one
group of people review one piece of code and have another group of people review
another piece of nearly identical code, when it would be better to have both
groups review only one crate? This idea is the core reason why the
<a href="https://crates.io/crates/x11rb-protocol"><code class="language-plaintext highlighter-rouge">x11rb-protocol</code></a> crate was created.</p>

<p>That being said, there are some things that are just so trivial that using separate
crates for them is far-and-away overkill. I’ve had to deal with crates that
depend on heavy hitters like <a href="https://crates.io/crates/regex"><code class="language-plaintext highlighter-rouge">regex</code></a> and <a href="https://crates.io/crates/nom"><code class="language-plaintext highlighter-rouge">nom</code></a> for byte-munching operations
that could have been implemented with basic slicing.</p>
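<p>As a rough idea of what “basic slicing” can look like, here is a sketch that parses a <code class="language-plaintext highlighter-rouge">key=value</code> line with nothing but the standard library. The line format and the <code class="language-plaintext highlighter-rouge">parse_pair</code> name are assumptions for illustration, not taken from any particular crate.</p>

```rust
/// Parses a `key=value` line using only standard-library string slicing.
/// Returns `None` if there is no `=` or the key is empty.
fn parse_pair(line: &str) -> Option<(&str, &str)> {
    // Split at the first `=`; both halves borrow from the input.
    let (key, value) = line.split_once('=')?;

    let key = key.trim();
    if key.is_empty() {
        return None;
    }

    Some((key, value.trim()))
}
```

<p>For a format this simple, a regular expression or a parser combinator adds compile time and dependencies without buying much.</p>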

<p>The worst offender for this problem is <a href="https://crates.io/crates/scopeguard"><code class="language-plaintext highlighter-rouge">scopeguard</code></a>.
There are very few use cases where this crate is economical over a few extra lines
of simple Rust code. Here is a quick polyfill:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Before:</span>
<span class="nn">scopeguard</span><span class="p">::</span><span class="nd">defer!</span> <span class="p">{</span>
  <span class="nf">do_the_thing</span><span class="p">();</span>
<span class="p">}</span>

<span class="c1">// After:</span>
<span class="k">struct</span> <span class="n">CallOnDrop</span><span class="o">&lt;</span><span class="n">F</span><span class="p">:</span> <span class="nf">FnMut</span><span class="p">()</span><span class="o">&gt;</span><span class="p">(</span><span class="n">F</span><span class="p">);</span>
<span class="k">impl</span><span class="o">&lt;</span><span class="n">F</span><span class="p">:</span> <span class="nf">FnMut</span><span class="p">()</span><span class="o">&gt;</span> <span class="nb">Drop</span> <span class="k">for</span> <span class="n">CallOnDrop</span><span class="o">&lt;</span><span class="n">F</span><span class="o">&gt;</span> <span class="p">{</span>
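  <span class="k">fn</span> <span class="nf">drop</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="p">{</span>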
    <span class="p">(</span><span class="k">self</span><span class="na">.0</span><span class="p">)();</span>
  <span class="p">}</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">_bomb</span> <span class="o">=</span> <span class="nf">CallOnDrop</span><span class="p">(||</span> <span class="nf">do_the_thing</span><span class="p">());</span>
</code></pre></div></div>
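<p>Here is a self-contained sketch of that polyfill in use, showing that the closure still runs when the function bails out early. The <code class="language-plaintext highlighter-rouge">step</code> and <code class="language-plaintext highlighter-rouge">run</code> functions are hypothetical; they only exist to record the order of events.</p>

```rust
use std::cell::RefCell;

/// Calls the wrapped closure when the value goes out of scope.
struct CallOnDrop<F: FnMut()>(F);

impl<F: FnMut()> Drop for CallOnDrop<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Hypothetical unit of work that may return early.
fn step(events: &RefCell<Vec<&'static str>>, fail_early: bool) -> Result<(), &'static str> {
    // The guard fires on every exit path out of this function.
    let _guard = CallOnDrop(|| events.borrow_mut().push("cleanup"));

    events.borrow_mut().push("start");
    if fail_early {
        return Err("bailed out");
    }
    events.borrow_mut().push("work");
    Ok(())
}

/// Runs `step` and returns the recorded events.
fn run(fail_early: bool) -> Vec<&'static str> {
    let events = RefCell::new(Vec::new());
    let _ = step(&events, fail_early);
    events.into_inner()
}
```

<p>Note that the guard has to be bound to a named variable such as <code class="language-plaintext highlighter-rouge">_guard</code>. Binding it to a bare <code class="language-plaintext highlighter-rouge">_</code> would drop it immediately.</p>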

<h2 id="safety-spinoff">Safety Spinoff</h2>

<p>Going back to my earlier point. These two types of dependencies exist in pretty
much all languages. I argue that Rust has a third type:</p>

<ul>
  <li>“Safety quarantine” crates that wrap some unsafe features in a safe wrapper.</li>
</ul>

<p>There are simpler ones like <a href="https://crates.io/crates/bytemuck"><code class="language-plaintext highlighter-rouge">bytemuck</code></a> which just wrap around simple data
transmutations in a way that the standard library hasn’t gotten around to
exposing yet. Then there are the C library wrappers like <a href="https://crates.io/crates/zstd"><code class="language-plaintext highlighter-rouge">zstd</code></a> which take
a C library and make it safe. I’m largely talking about the <a href="https://crates.io/crates/bytemuck"><code class="language-plaintext highlighter-rouge">bytemuck</code></a>-type
wrappers here.</p>

<p>When used effectively, these kinds of “safety quarantine” crates let you isolate
unsafe code to specific parts of your dependency graph so you can be sure the rest
is safe. If the other crates in your dependency tree use <code class="language-plaintext highlighter-rouge">#![forbid(unsafe_code)]</code>,
you can be sure that any behavioral problems will only come from those
“quarantined” unsafe code crates.</p>
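<p>The idea can be sketched inside a single file by letting modules stand in for crates. Everything here is hypothetical: <code class="language-plaintext highlighter-rouge">quarantine</code> plays the role of the small unsafe crate, and <code class="language-plaintext highlighter-rouge">app</code> plays the role of a downstream crate that forbids unsafe code. In a real project the attribute would be the crate-level <code class="language-plaintext highlighter-rouge">#![forbid(unsafe_code)]</code> rather than a module-level one.</p>

```rust
/// Stand-in for a "safety quarantine" crate: the only place `unsafe` appears.
mod quarantine {
    /// Views a slice of `u32` as its underlying bytes (native endianness).
    pub fn as_bytes(words: &[u32]) -> &[u8] {
        // SAFETY: `u8` has alignment 1 and no invalid bit patterns, `u32`
        // has no padding, and the length covers exactly the memory that
        // `words` already borrows.
        unsafe {
            std::slice::from_raw_parts(
                words.as_ptr().cast::<u8>(),
                std::mem::size_of_val(words),
            )
        }
    }
}

/// Stand-in for a downstream crate that opts out of `unsafe` entirely.
#[forbid(unsafe_code)]
mod app {
    use super::quarantine;

    /// Sums every byte of the input using only the safe wrapper.
    pub fn checksum(words: &[u32]) -> u32 {
        quarantine::as_bytes(words)
            .iter()
            .map(|&byte| u32::from(byte))
            .sum()
    }
}
```

<p>If <code class="language-plaintext highlighter-rouge">checksum</code> ever misbehaves in a memory-unsafe way, the only code that needs auditing is the handful of lines in <code class="language-plaintext highlighter-rouge">quarantine</code>.</p>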

<p>In the dependency tree of <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>, we use this
strategy to great effect. It’s impossible to create a performant executor without
some level of unsafe code. So we isolate all of the unsafe code to
<a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a>, so <a href="https://crates.io/crates/async-executor"><code class="language-plaintext highlighter-rouge">async-executor</code></a>
can have almost no unsafe code. Similarly, efficient lock-free channels require
unsafe code to implement on some level. So we isolate all of the unsafe code to
<a href="https://crates.io/crates/concurrent-queue"><code class="language-plaintext highlighter-rouge">concurrent-queue</code></a> so that the
<a href="https://crates.io/crates/async-channel"><code class="language-plaintext highlighter-rouge">async-channel</code></a> crate can be
<code class="language-plaintext highlighter-rouge">#![forbid(unsafe_code)]</code>. While <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> may have quite a few dependencies, many
of these dependencies exist so that the crates built on top of them can avoid unsafe
code entirely.</p>

<p>I think it would be nicer if more crates did this. <a href="https://crates.io/crates/eframe"><code class="language-plaintext highlighter-rouge">eframe</code></a>, one of the more
popular GUI crates on the market, has around 120 dependencies. I wonder how much
more palatable it would be if more of those dependencies were written with
entirely safe code.</p>

<h2 id="what-do">What do?</h2>

<p>Still, many crates end up just having a <em>smidgeon</em> of unsafe code in them,
however justified. Not to mention, safe code is code you still have to audit.
So while there’s nothing wrong with handfuls of micro-crates that each do something
well, we should still seek a way to reduce the number of dependencies in our
dependency tree.</p>

<p>The first measure we can take is somewhat obvious, for both application and
library developers. It’s possible to minimize features, which often minimizes
dependencies. You can run <code class="language-plaintext highlighter-rouge">cargo add</code> with the <code class="language-plaintext highlighter-rouge">--no-default-features</code>
flag, like so:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo add nom <span class="nt">--no-default-features</span>
</code></pre></div></div>

<p>In my own applications, I usually start by adding dependencies without default
features, and then adding features piecemeal whenever I need them. Because of
this, I can usually keep my dependency tree to a minimum.</p>

<p>When this fails, don’t be afraid to switch to another, parallel crate. For many
dependency-heavy crates, there’s usually an alternative crate with far fewer
dependencies, often while retaining the core functionality you depend on. For
example, instead of <a href="https://crates.io/crates/futures"><code class="language-plaintext highlighter-rouge">futures</code></a>, consider using <a href="https://crates.io/crates/futures-lite"><code class="language-plaintext highlighter-rouge">futures-lite</code></a>. Instead of
<a href="https://crates.io/crates/actix-web"><code class="language-plaintext highlighter-rouge">actix-web</code></a>, maybe see if your use case can be fulfilled by <a href="https://crates.io/crates/axum"><code class="language-plaintext highlighter-rouge">axum</code></a>.</p>

<p>I’ve found that, with these two strategies, you can minimize the dependencies of
your crate or application. Usually, to a much more manageable level.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:syscall" role="doc-endnote">
      <p>Technically doesn’t apply to Linux, but if you’re going with the “raw system call” you’re probably using <a href="https://crates.io/crates/rustix"><code class="language-plaintext highlighter-rouge">rustix</code></a> or something anyhow. <a href="#fnref:syscall" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>John Nunley</name></author><category term="rust" /><category term="smol" /><summary type="html"><![CDATA[I hate to be the first one to tell you this, but Rust projects tend to have a lot of dependencies.]]></summary></entry><entry><title type="html">async/await versus the Calloop Model</title><link href="http://notgull.net/calloop/" rel="alternate" type="text/html" title="async/await versus the Calloop Model" /><published>2025-05-18T00:00:00+00:00</published><updated>2025-05-18T00:00:00+00:00</updated><id>http://notgull.net/calloop</id><content type="html" xml:base="http://notgull.net/calloop/"><![CDATA[<p>Of the two models for asynchronous programs, which one works better for your usecase?</p>

<p><img src="/images/notgull-calloop.jpg" alt="Finger Gun" /></p>

<p>This post is a long overdue follow-up to my <a href="/blocking-leaky">earlier post</a> 
about how blocking code is a leaky abstraction. I’ve written <a href="/why-not-threads">a lot</a>
<a href="/why-you-want-async">of words</a> on this blog defending one of Rust’s most
controversial features: <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> syntax. Many view it as an overcomplication
in an otherwise elegant language; I see it as the missing piece that lets you
model asynchronous functionality and treat I/O operations as data.</p>

<p>However, in these blogposts I realize that I have made a significant error. I have
presented <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> as the <em>only</em> way to write asynchronous programs in Rust.
There are many other ways to write programs with non-standard control flow in Rust;
in fact, many of these strategies predate <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> as a whole.</p>

<p>Most of these strategies rely on a callback-based event loop; where you pass a
callback to an event source, and then that event source triggers the callback
when an event happens. This is a tried and true model; <a href="https://en.wikipedia.org/wiki/Libuv"><code class="language-plaintext highlighter-rouge">libuv</code></a> is an example
of an implementation of callback loops that’s used in <a href="https://www.esparkinfo.com/software-development/technologies/nodejs/statistics">millions</a> of applications
by way of <a href="https://nodejs.org/en">node.js</a>. As for Rust, <a href="https://github.com/rust-windowing/winit"><code class="language-plaintext highlighter-rouge">winit</code></a> and the appropriately named <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>
are examples of popular callback event loops.</p>
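<p>To show the general shape of the model, here is a toy callback loop written from scratch. To be clear, this is <em>not</em> the API of <code class="language-plaintext highlighter-rouge">calloop</code>, <code class="language-plaintext highlighter-rouge">winit</code> or <code class="language-plaintext highlighter-rouge">libuv</code>; the <code class="language-plaintext highlighter-rouge">Event</code> and <code class="language-plaintext highlighter-rouge">EventLoop</code> types are hypothetical, and real event loops wait on the operating system instead of draining an in-memory queue.</p>

```rust
use std::collections::VecDeque;

/// Hypothetical event type, for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    KeyPress(char),
    Tick,
}

/// A toy single-threaded callback loop. `State` is shared data that is
/// handed to every callback by mutable reference.
struct EventLoop<State> {
    callbacks: Vec<Box<dyn FnMut(&Event, &mut State)>>,
    pending: VecDeque<Event>,
}

impl<State> EventLoop<State> {
    fn new() -> Self {
        Self { callbacks: Vec::new(), pending: VecDeque::new() }
    }

    /// Registers a callback that is invoked for every dispatched event.
    fn register(&mut self, callback: impl FnMut(&Event, &mut State) + 'static) {
        self.callbacks.push(Box::new(callback));
    }

    /// Queues an event; a real loop would get these from the OS.
    fn push_event(&mut self, event: Event) {
        self.pending.push_back(event);
    }

    /// Drains the queue, calling every callback once per event.
    fn dispatch(&mut self, state: &mut State) {
        while let Some(event) = self.pending.pop_front() {
            for callback in &mut self.callbacks {
                callback(&event, state);
            }
        }
    }
}

/// Wires up two callbacks and returns what they recorded.
fn demo() -> Vec<String> {
    let mut event_loop: EventLoop<Vec<String>> = EventLoop::new();

    event_loop.register(|event, log| {
        if let Event::KeyPress(c) = event {
            log.push(format!("key {c}"));
        }
    });
    event_loop.register(|event, log| {
        if *event == Event::Tick {
            log.push("tick".to_owned());
        }
    });

    event_loop.push_event(Event::KeyPress('a'));
    event_loop.push_event(Event::Tick);

    let mut log = Vec::new();
    event_loop.dispatch(&mut log);
    log
}
```

<p>Passing the shared state into <code class="language-plaintext highlighter-rouge">dispatch</code> by mutable reference is one way to let every callback touch the same data without reference counting or locks, at the cost of keeping everything on one thread.</p>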

<p>For this post, I’d like to take a closer look at <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>. I feel as though it’s
a pretty good example of callback event loops, and in fact the Linux backend for
<a href="https://github.com/rust-windowing/winit"><code class="language-plaintext highlighter-rouge">winit</code></a> is built on <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>. I’ll compare <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> to <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>
(or specifically <a href="https://github.com/smol-rs/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>) and see which applications each is a good fit for.</p>

<p><em>Disclaimer: I maintain an async runtime, <a href="https://github.com/smol-rs/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>. However, I also maintain <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> and <a href="https://github.com/rust-windowing/winit"><code class="language-plaintext highlighter-rouge">winit</code></a> (although I am on hiatus). So I am biased in both directions.</em></p>

<h2 id="breakdown-boogie">Breakdown Boogie</h2>

<p>What you must first understand is that the underlying design philosophies for
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> and <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> have very different goals.</p>

<p><code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> was designed specifically for networking applications; where you
have to scale to handling millions of requests at once across dozens of CPU
cores. Some writers have asserted that this means async I/O for smaller use
cases <a href="https://shnatsel.medium.com/smoke-testing-rust-http-clients-b8f2ee5db4e6">“should be a weird thing that you resort to for niche use cases”</a>.
I argue that it means you can scale with relative ease. If you know code can
handle 5,000,000 concurrent tasks, that means it can handle 5 with no issues.</p>

<p><a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>, meanwhile, is designed for single threaded applications. As in, an
application where all of your logic runs on one single CPU core. This does not
necessarily mean the application is slower or meant to be used less; 
<a href="https://redis.io/">Redis</a> notably <a href="https://redis.io/blog/redis-architecture-13-years-later/">runs on a single-threaded architecture</a>. That said, <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is not
meant for high-performance I/O; this is made explicit in <a href="https://github.com/Smithay/calloop/blob/master/README.md">its documentation</a>:</p>

<blockquote>
  <p>The main target use of this event loop is thus for apps that expect to spend most of their time waiting for events and wishes to do so in a cheap and convenient way. It is not meant for large scale high performance IO.</p>
</blockquote>

<p><code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> is the definite winner when it comes to high performance applications
that expect to scale vertically. However, the <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> model is used more
frequently in the GUI ecosystem. This makes sense once you realize <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>
was created by the same person who created <a href="https://github.com/smithay/wayland-rs">the Rust Wayland implementation</a>
and <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is effectively a Rust implementation of the <a href="https://wayland-book.com/wayland-display/event-loop.html">Wayland event loop</a>.
As I mentioned earlier, <a href="https://github.com/rust-windowing/winit"><code class="language-plaintext highlighter-rouge">winit</code></a> is built on <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>.</p>

<p>In the past, I’ve <a href="/async-gui">proposed</a> that we could work <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> into
the GUI ecosystem and extolled the possible benefits. Needless to say, I think
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> could have its place in the GUI ecosystem, a place that <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a>
has traditionally occupied.</p>

<h2 id="advantage-async">Advantage Async</h2>

<p>To head this off at the pass: you don’t have to exclusively use <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>
or <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> in your program. They are compatible! Best friends! Roommates! <del>smooching</del></p>

<p><a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> goes out of its way to add <a href="https://docs.rs/calloop/latest/calloop/#asyncawait-compatibility"><code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> compatibility</a>,
making it so you can easily run <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s inside of the event loop. Meanwhile,
you can use the <a href="https://crates.io/crates/async-io"><code class="language-plaintext highlighter-rouge">async-io</code></a> crate to poll an <a href="https://docs.rs/calloop/latest/calloop/struct.EventLoop.html"><code class="language-plaintext highlighter-rouge">EventLoop</code></a> on certain platforms:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">async_io</span><span class="p">::</span><span class="n">Async</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">calloop</span><span class="p">::</span><span class="n">EventLoop</span><span class="p">;</span>

<span class="c1">// Wrap the event loop into the `smol` runtime.</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">event_loop</span> <span class="o">=</span> <span class="nn">Async</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">EventLoop</span><span class="p">::</span><span class="nf">try_new</span><span class="p">()</span><span class="o">?</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

<span class="c1">// Dispatch events when needed.</span>
<span class="k">loop</span> <span class="p">{</span>
    <span class="n">event_loop</span><span class="nf">.read_with</span><span class="p">(|</span><span class="n">event_loop</span><span class="p">|</span> <span class="p">{</span>
        <span class="n">event_loop</span><span class="nf">.dispatch</span><span class="p">(</span><span class="nb">None</span><span class="p">,</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="p">())</span>
    <span class="p">})</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Since GUI programs are <a href="https://devblogs.microsoft.com/oldnewthing/20071018-00/?p=24743">mostly single threaded</a>,
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>’s multithreaded advantages don’t really apply. But that doesn’t
mean <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> is powerless here; there are <a href="https://crates.io/crates/unsend">single threaded runtimes</a>.
The main disadvantage of these runtimes is that <a href="https://doc.rust-lang.org/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a>, the backbone of <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>,
is thread-safe and assumes it’s going to be sent to other threads. This condition
requires that you build all of your primitives on synchronization primitives like <a href="https://doc.rust-lang.org/std/sync/struct.Mutex.html"><code class="language-plaintext highlighter-rouge">Mutex</code></a>,
which adds a performance drag to the program. <a href="https://doc.rust-lang.org/std/task/struct.LocalWaker.html"><code class="language-plaintext highlighter-rouge">LocalWaker</code></a> would fix this issue if it’s ever
stabilized.</p>
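<p>Here’s a small sketch of what that cost looks like in practice, using only the standard library. <code class="language-plaintext highlighter-rouge">WakeFlag</code> and <code class="language-plaintext highlighter-rouge">wake_once</code> are names I’ve made up for illustration. The point is that even a flag that is only ever touched from one thread has to be an atomic inside an <code class="language-plaintext highlighter-rouge">Arc</code>, because that is what building a <code class="language-plaintext highlighter-rouge">Waker</code> through the <code class="language-plaintext highlighter-rouge">Wake</code> trait demands.</p>

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical "this task was woken" flag for a single-threaded executor.
struct WakeFlag {
    // A plain `Cell<bool>` would be enough on one thread, but `Waker` is
    // `Send + Sync`, so the `Wake` trait only accepts thread-safe types.
    woken: AtomicBool,
}

impl Wake for WakeFlag {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::Release);
    }
}

/// Builds a waker from the flag, wakes it once, and reports whether the
/// flag ended up set.
fn wake_once() -> bool {
    let flag = Arc::new(WakeFlag {
        woken: AtomicBool::new(false),
    });
    let waker = Waker::from(flag.clone());
    waker.wake_by_ref();
    flag.woken.load(Ordering::Acquire)
}
```

<p>An atomic store is cheap, but real executors usually need a queue behind that flag, and that is where the locking tends to creep in.</p>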

<p>There are two principal advantages to <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> in this model. The first
is that <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> brings <a href="/why-not-threads">composability</a>, a property
I believe is valuable in GUI applications. <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is admittedly not composable,
preferring <a href="https://github.com/Smithay/calloop/issues/207#issuecomment-2310397119">multiple event sources</a>
over being able to chain event sources together.</p>

<p>I see libraries like <a href="https://github.com/AccessKit/accesskit"><code class="language-plaintext highlighter-rouge">accesskit</code></a>,
<a href="https://docs.rs/ui-events-winit/0.1.0/ui_events_winit/"><code class="language-plaintext highlighter-rouge">ui_events_winit</code></a> and
<a href="https://docs.rs/wgpu/24.0.3/wgpu/"><code class="language-plaintext highlighter-rouge">wgpu</code></a> that are <em>begging</em> for composability. <a href="https://docs.rs/wgpu/24.0.3/wgpu/"><code class="language-plaintext highlighter-rouge">wgpu</code></a> even has an official
<a href="https://github.com/gfx-rs/wgpu/wiki/Encapsulating-Graphics-Work">middleware pattern</a>.
Not to mention that GUI widgets need to be composable enough to layer on top of each other
in order to actually be useful.</p>

<p>The second advantage is easy integration with the rest of the Rust <code class="language-plaintext highlighter-rouge">async</code>
ecosystem. Many useful GUI apps end up having to make network calls at some
point; imagine being able to seamlessly make network requests from your GUI
application without freezing up like every other Win32 application seems to do.
Not to mention, the <code class="language-plaintext highlighter-rouge">async</code> ecosystem has <a href="https://docs.rs/async-executor/latest/async_executor/">mature event delivery mechanisms</a>
that could solve the “event delivery” problem Rust GUI frequently has.</p>

<p>I’ve already written about this at length <a href="/async-gui">here</a>, if you’re
interested in me going into more detail.</p>

<h2 id="shared-state-scenario">Shared State Scenario</h2>

<p>However, <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> has one <em>crucial</em> advantage that <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> will never
have. One critical strength that means there are very good reasons why programmers
will and <em>should</em> choose <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> for many use cases. The silver bullet, the
kryptonite: <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> allows for shared state in a way that <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> does not.</p>

<p><a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is designed around having a shared structure that’s passed around
to every event source. You pass it some shared state in <code class="language-plaintext highlighter-rouge">dispatch</code>…</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">fs</span><span class="p">::</span><span class="n">File</span><span class="p">;</span>

<span class="k">struct</span> <span class="n">MyState</span> <span class="p">{</span>
    <span class="n">name</span><span class="p">:</span> <span class="nb">String</span><span class="p">,</span>
    <span class="n">counter</span><span class="p">:</span> <span class="nb">i32</span><span class="p">,</span>
    <span class="n">foobar</span><span class="p">:</span> <span class="n">File</span><span class="p">,</span>
<span class="p">}</span>

<span class="k">let</span> <span class="k">mut</span> <span class="n">state</span> <span class="o">=</span> <span class="n">MyState</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span>
<span class="c1">// `dispatch` returns a `Result`, so propagate any error.</span>
<span class="n">event_loop</span><span class="nf">.dispatch</span><span class="p">(</span><span class="nb">None</span><span class="p">,</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">state</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
</code></pre></div></div>

<p>…and suddenly, every event source has direct <code class="language-plaintext highlighter-rouge">&amp;mut</code> access to <code class="language-plaintext highlighter-rouge">state</code>. This
sharing works because <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is just calling event sources in order on a
single thread in a loop; it can trivially pass <code class="language-plaintext highlighter-rouge">&amp;mut MyState</code> to an event source
while it’s polling it for completion.</p>
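<p>To make the mechanism concrete, here’s a minimal sketch of the same idea using only the standard library. <code class="language-plaintext highlighter-rouge">MiniLoop</code>, <code class="language-plaintext highlighter-rouge">insert_source</code> and <code class="language-plaintext highlighter-rouge">run_demo</code> are names I’ve made up for illustration; this is not <code class="language-plaintext highlighter-rouge">calloop</code>’s real API, just the shape of a loop that lends <code class="language-plaintext highlighter-rouge">&amp;mut</code> state to each callback in turn.</p>

```rust
/// Hypothetical stand-in for an event loop; not calloop's real API.
struct MiniLoop<D> {
    sources: Vec<Box<dyn FnMut(&mut D)>>,
}

impl<D> MiniLoop<D> {
    fn new() -> Self {
        MiniLoop { sources: Vec::new() }
    }

    /// Registers a callback that runs on every dispatch.
    fn insert_source(&mut self, callback: impl FnMut(&mut D) + 'static) {
        self.sources.push(Box::new(callback));
    }

    /// Calls every source in order, lending each one the shared state.
    /// Only one callback runs at a time, so the borrow checker is happy.
    fn dispatch(&mut self, data: &mut D) {
        for source in self.sources.iter_mut() {
            source(&mut *data);
        }
    }
}

struct MyState {
    name: String,
    counter: i32,
}

/// Registers two sources, dispatches twice, and returns the final state.
fn run_demo() -> MyState {
    let mut event_loop: MiniLoop<MyState> = MiniLoop::new();
    event_loop.insert_source(|state: &mut MyState| state.counter += 1);
    event_loop.insert_source(|state: &mut MyState| state.name.push('!'));

    let mut state = MyState {
        name: String::from("gull"),
        counter: 0,
    };
    event_loop.dispatch(&mut state);
    event_loop.dispatch(&mut state);
    state
}
```

<p>Neither callback owns the state or wraps it in a lock or a cell. Each one just borrows it for the duration of its call.</p>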

<p><code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> has no good answer for this. Usually, network applications are
designed in accordance with the <a href="https://en.wikipedia.org/wiki/Actor_model">actor model</a>,
where each task has its own specific state and only shares it with other tasks
via channels. Many GUI applications are designed like this as well. But many more
are designed around only having one single blob of mutable state that every widget
<em>needs</em> to have access to.</p>

<p>Granted, <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> has no problem sharing immutable state (via immutable
references, with <code class="language-plaintext highlighter-rouge">&amp;state</code>). Using primitives like <a href="https://doc.rust-lang.org/std/cell/struct.RefCell.html"><code class="language-plaintext highlighter-rouge">RefCell</code></a> in a single-threaded
setting, it’s possible to turn this into mutable state. But this solution
introduces ugly, brittle interior mutability into the Rust program. It’s a pale
shadow of what <a href="https://github.com/Smithay/calloop"><code class="language-plaintext highlighter-rouge">calloop</code></a> is able to achieve so effortlessly.</p>
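<p>For comparison, here’s roughly what the <code class="language-plaintext highlighter-rouge">RefCell</code> workaround looks like. This is a sketch under my own assumptions: <code class="language-plaintext highlighter-rouge">block_on</code> here is a toy busy-polling executor written for this example (not any real runtime’s), and <code class="language-plaintext highlighter-rouge">bump</code> and <code class="language-plaintext highlighter-rouge">run_tasks</code> are made-up names.</p>

```rust
use std::cell::RefCell;
use std::future::Future;
use std::pin::pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that does nothing; good enough for a toy executor.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy single-threaded executor, for illustration only. It busy-polls.
fn block_on<F: Future>(future: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut future = pin!(future);
    loop {
        if let Poll::Ready(value) = future.as_mut().poll(&mut cx) {
            return value;
        }
    }
}

struct MyState {
    counter: i32,
}

/// Every task needs its own `Rc` handle and a runtime-checked borrow.
async fn bump(state: Rc<RefCell<MyState>>) {
    // If this borrow were held across an `.await` while another task
    // also called `borrow_mut`, the program would panic at runtime.
    state.borrow_mut().counter += 1;
}

/// Runs two tasks against the shared state and returns the counter.
fn run_tasks() -> i32 {
    let state = Rc::new(RefCell::new(MyState { counter: 0 }));
    block_on(async {
        bump(state.clone()).await;
        bump(state.clone()).await;
    });
    let total = state.borrow().counter;
    total
}
```

<p>It works, but the aliasing rules have moved from compile time to runtime, and every task has to remember not to hold a borrow across an await point.</p>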

<p>There is <em>some</em> form of shared state in <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>, via the <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code class="language-plaintext highlighter-rouge">Context</code></a>
parameter that is passed to all <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s. But as of the time of writing, this
value only holds a <a href="https://doc.rust-lang.org/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a>, and is often re-created wholesale during things
like async task polling.</p>

<p>There is a proposed <a href="https://github.com/rust-lang/rust/issues/123392"><code class="language-plaintext highlighter-rouge">ext()</code></a>
method that would allow holding arbitrary extension data inside of the <a href="https://doc.rust-lang.org/std/task/struct.Context.html"><code class="language-plaintext highlighter-rouge">Context</code></a>.
However, even this is a shallow imitation. It only holds a <code class="language-plaintext highlighter-rouge">&amp;mut dyn Any</code>, a
type-erased value that will need to be cast into whatever value you need. Even if you’re
okay with that, it will take some work for <code class="language-plaintext highlighter-rouge">ext()</code> to be handled by the Rust <code class="language-plaintext highlighter-rouge">async</code>
ecosystem.</p>
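<p>To show what that casting looks like, here’s a sketch that leaves <code class="language-plaintext highlighter-rouge">Context</code> out entirely and only demonstrates the <code class="language-plaintext highlighter-rouge">&amp;mut dyn Any</code> part with the standard library. <code class="language-plaintext highlighter-rouge">handle_event</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical names, and this says nothing about how the proposed <code class="language-plaintext highlighter-rouge">ext()</code> method will end up being used.</p>

```rust
use std::any::Any;

struct MyState {
    counter: i32,
}

/// Hypothetical handler that only receives type-erased extension data.
/// Returns `false` if the data turns out not to be a `MyState`.
fn handle_event(ext: &mut dyn Any) -> bool {
    match ext.downcast_mut::<MyState>() {
        Some(state) => {
            state.counter += 1;
            true
        }
        // The type mismatch is only discovered at runtime.
        None => false,
    }
}

/// Calls the handler once with the right type and once with the wrong one.
fn demo() -> (bool, bool, i32) {
    let mut state = MyState { counter: 0 };
    let right = handle_event(&mut state);

    let mut not_state = String::from("not the state");
    let wrong = handle_event(&mut not_state);

    (right, wrong, state.counter)
}
```

<p>Every consumer has to agree on the concrete type out of band, and a mismatch is a runtime failure rather than a compile error.</p>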

<p>So yes, <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> has no rebuttal to this problem.</p>

<h2 id="conclusion">Conclusion</h2>

<p>While <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> holds many advantages for programs that are using the
actor model, it will take some work and compromise for it to be integrated into
programs that use a significant amount of shared state. I cope by saying shared
state is an antipattern in Rust anyways.</p>]]></content><author><name>John Nunley</name></author><category term="async" /><category term="calloop" /><category term="rust" /><category term="smol" /><summary type="html"><![CDATA[Of the two models for asynchronous programs, which one works better for your usecase?]]></summary></entry><entry><title type="html">notgull versus burnout</title><link href="http://notgull.net/burnout/" rel="alternate" type="text/html" title="notgull versus burnout" /><published>2025-05-08T00:00:00+00:00</published><updated>2025-05-08T00:00:00+00:00</updated><id>http://notgull.net/burnout</id><content type="html" xml:base="http://notgull.net/burnout/"><![CDATA[<p>Around November of last year I was afflicted with a crippling case of burnout.
I want to talk about it.</p>

<p><img src="/images/notgull-crashout.jpg" alt="Crashout" /></p>

<p><em>Alt: notgull slumped on the floor holding a bottle of Absolut Vodka.</em></p>

<p>In terms of open source code, 2023 was fairly decent for me. I was the
maintainer of two popular projects that were used in a number of real
world usecases. I was also working on some new, exciting projects that took
these existing usecases and did something interesting with them. I even had
consistent posts on this blog, which brought me some attention from people I
really respected. By all accounts, I was doing great.</p>

<p>In 2024, I suddenly stopped contributing so much. Projects that I was previously
spending a significant amount of time on suddenly stopped receiving updates.
PRs went unreviewed. Issues went unanswered. I had people asking me if something
was wrong. I went five months on this blog without a single post, with the
<a href="/blocking-leaky">one post</a> just being a completion of a mostly-finished
article I had before.</p>

<p>By 2025, I had announced that I was <a href="https://hachyderm.io/@notgull/114152193300248947">going into hiatus</a>
and stepped back from all of my projects. Even then, this announcement was
largely a formalization of the existing state of affairs. Any project that I had
been maintaining was effectively abandoned.</p>

<p>I’d like to analyze my burnout in greater depth. Specifically I’d like to go
into the experience of being burnt out, and the reasons why I felt this way.
Mostly, just as a way of getting all of those emotions into a rational form.
Partially, so other people can avoid falling into the same pitfalls I did.</p>

<p><em>In case you can’t tell, this one is going to be very personal, in a way that articles on this blog haven’t been so far. Feel free to skip if that’s not your thing.</em></p>

<h2 id="prelude">Prelude</h2>

<p>At the beginning of 2024, I still felt pretty good.</p>

<p>The past couple of years had been pretty transformative for me. I’d been writing
open source software for a while, but not the kind of software that ends up being
used. It all started when the fantastic <a href="https://github.com/taiki-e">Taiki Endo</a>
had offered me a maintainer spot in <a href="https://github.com/smol-rs"><code class="language-plaintext highlighter-rouge">smol-rs</code></a>. I was
still writing code, but all of a sudden, people were actually taking that code and
<em>doing</em> something with it. 
Like if you walked into your model train set, and all of a sudden there were
actual passengers using it to get to their day jobs. I felt ecstatic.</p>

<p>In the time since then I’d become somewhat well known in the Rust community for maintaining <code class="language-plaintext highlighter-rouge">smol</code>. I’d
also started writing code for <a href="https://github.com/rust-windowing/"><code class="language-plaintext highlighter-rouge">rust-windowing</code></a>,
further expanding my repertoire. This led to a couple of cool one-off projects,
like <a href="https://github.com/notgull/theo"><code class="language-plaintext highlighter-rouge">theo</code></a> and <a href="https://github.com/notgull/async-winit"><code class="language-plaintext highlighter-rouge">async-winit</code></a>.
Even if those projects wouldn’t see wide use, you don’t know how fun it was to
tinker with a codebase like <a href="https://github.com/notgull/theo"><code class="language-plaintext highlighter-rouge">theo</code></a> and watch it grow into something actually
usable.</p>

<p>Not to mention, I also started writing this blog! It’s quite rewarding to write
something, and then see people actually share your opinions. It’s like seeing
yourself on TV.</p>

<p>I think it’s a very common experience for an open-source programmer to start at
a programming job, and then just completely halt their open-source activity. I
started at my current position at my current company in the summer of 2023, after
I’d graduated college. Even with my full-time job, I was still writing open
source code and keeping a pretty good output overall. I’d felt like I’d beaten
the odds; I could have my cake and eat it too.</p>

<p>Even going into 2024, I still felt like I was on top of the world. At some point I promised myself
that I would write one blogpost every two weeks. Of course, that didn’t pan out,
but who cares? I was still writing tons of code, both for my job and for the
open-source community. I could keep going forever if I wanted to.</p>

<p>So why did everything change?</p>

<h2 id="the-experience-of-being-burnt-out">The Experience of being Burnt Out</h2>

<p>All of a sudden, I was having trouble getting into my open source projects. The
passion that had previously pushed me forwards was waning. I still liked reading
through issues and solving them. But it started to take on a grating cadence in my
mind; it was something I was doing because I was forcing myself to, not because I
wanted to.</p>

<p>I still kept at it, though. I’m nothing if not stubborn. But that approach
slowly wore down. The act of reviewing pull requests became a chore that I
never looked forward to.</p>

<p>By May, I could barely bring myself to look at the code. Even opening GitHub
induced a visceral reaction. Reading through the code that I’d previously enjoyed 
skimming for fun was like trudging through molasses. A struggle.</p>

<p>In a few months’ time, I couldn’t even bring myself to look at the <code class="language-plaintext highlighter-rouge">smol</code>
codebase.</p>

<p>Lack of progress brought shame, and shame induced further lack of progress. I knew there were issues piling up on my
projects, but what could I do about them? Like sewage water filling up my
neighbor’s apartment, there were urgent problems but I couldn’t do anything
about them. Those tasks just seemed to loom large on my to-do list. Like
statues that I had been tasked to knock down myself, with nothing but a single
sledge hammer and a bottle of hydrochloric acid.</p>

<p>Maybe I could’ve kept going. But at some point I just realized, I couldn’t even
feign it anymore.</p>

<p>Maybe this was why I publicly announced <a href="/announcing-dozer">dozer</a>, which was a mistake in my
opinion. <a href="/announcing-dozer">dozer</a> wasn’t in anywhere near the state where it would be productive
to have multiple people working on it. I ended up having to completely re-do
the backend if I wanted to handle even a <em>fraction</em> of Rust’s core language. A
re-do that is, to this day, incomplete.</p>

<p>I’m sure you know the experience of starting a new project. All of the potential
in front of you, ready to be seized and turned into something wonderful. It’s a
high like pure crack cocaine. But soon you’re hit with the cold reality that
you can’t just build castles in the clouds. Eventually you’re going to have to
implement the bits of the project you never expected to implement, or completely
change your plans to fit your API.</p>

<p>In the end, you eventually have to slog through
those bits. Especially when you’re already burnt out from other projects, each of
those bits feels like you’re carving them with a dark obsidian knife out of your very soul.</p>

<p>In March of 2025, I did something long overdue. I announced my hiatus, and said
I was taking a break. This, thankfully, relieved the pressure that I felt on me,
real or imagined. It’s given me time to think, time to digest my experience.</p>

<p>I feel better now, by the way. I’m not sure I want to return from my hiatus just
yet; I imagine it will be a gradual return once I am ready. But, I’m writing code
again. For fun! I’m excited to see where it goes.</p>

<h2 id="the-cause-of-burnout">The Cause of Burnout</h2>

<p>In the above paragraphs, I realize I’ve made an error; I’ve made it seem like
burnout was an inevitable fact, a direct consequence of long-term open source
development. I don’t think it is. I’d like to go into certain external factors
within my life and see what I can pick out.</p>

<p>Obviously, there’s work. Around the start of 2024, my responsibilities were
suddenly expanded. To make a <em>very</em> long story short, I’d been assigned to a 
project that was an order of magnitude more important and more stressful
than any I’d been assigned before. I found myself writing and testing a significant
amount of code, and it began to eat into my energy budget.</p>

<p>I genuinely enjoy writing code. As I alluded to above, it’s like building a model train set
that people can actually ride. But, even I have my limits as to how much code
I can write in a day.</p>

<p>I’ve been told this is likely an example of the <a href="https://en.wikipedia.org/wiki/Overjustification_effect">overjustification effect</a>. If
you give someone who is intrinsically motivated to perform a task external
incentives to complete that task, they will lose that intrinsic motivation.</p>

<p>As an example, let’s say you see a kid who plays on a swing in the park every
day. You go up to the kid and say “I will give you twenty dollars every time I
see you playing on this swing”. Of course the kid will keep playing on the swing
when you come by every day. It’s a win-win!</p>

<p>But then let’s say you stop coming by with the money. Will the kid keep playing on
the swing? No. Because at some point it stopped being something the kid did for
fun and started being a chore they had to do for money.</p>

<p>Right now I feel like I may be the kid on the swing, and someone’s suddenly
offered me a lot of money to keep swinging.</p>

<p>But maybe it’s unfair to blame it all on work. While the projects I’m assigned
to may be intense, I still find them quite fun. To be clear, I think the
overjustification effect is <em>part</em> of it, but not quite the whole problem.</p>

<p>Circa April 2024, I was going through something that I wouldn’t quite describe as
a downward spiral, but it certainly felt like one at the time. I can describe it
as “finding out a lot of things the hard way”.</p>

<p>In the end, I had the feeling of
falling even though I was standing on solid ground. This kind of panic
definitely contributed to my burnout, along with a handful of longstanding
personal issues I was going through at the time.</p>

<p>It’s never just one thing, it seems, that overwhelms the defenses. It’s always
at least two or three.</p>

<h2 id="what-now">What now?</h2>

<p>As I said before, I’m feeling better. A lot of things in my life have stabilized
for the time being, and I’m finally seeing a psychologist to help me through my
varied emotions. Finally, I’m feeling that passion that originally drew me to open
source come back.</p>

<p>I don’t want to immediately burn through that passion, though. I want to nurture
it and let it grow again, like it did the first time. I’m slowly but surely
taking up some of my projects again. Maybe someday we’ll see a half-decent <code class="language-plaintext highlighter-rouge">dozer</code>,
or a <code class="language-plaintext highlighter-rouge">smol-rs</code> as lively as it used to be.</p>

<p>Maybe 2025 will be the year of the notgull.</p>]]></content><author><name>John Nunley</name></author><category term="personal" /><summary type="html"><![CDATA[Around November of last year I was afflicted with a crippling case of burnout. I want to talk about it.]]></summary></entry><entry><title type="html">Blocking code is a leaky abstraction</title><link href="http://notgull.net/blocking-leaky/" rel="alternate" type="text/html" title="Blocking code is a leaky abstraction" /><published>2024-10-19T00:00:00+00:00</published><updated>2024-10-19T00:00:00+00:00</updated><id>http://notgull.net/blocking-leaky</id><content type="html" xml:base="http://notgull.net/blocking-leaky/"><![CDATA[<p>Asynchronous code does not require the rest of your code to be asynchronous.
I can’t say the same for blocking code.</p>

<p><strong>Disclaimer:</strong> I am one of the maintainers for <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>, a small and fast <code class="language-plaintext highlighter-rouge">async</code> runtime for Rust.</p>

<p>I’ve been involved in the Rust community for four years at this point. In that
time, I’ve seen a lot of criticism of <code class="language-plaintext highlighter-rouge">async</code>. I’ve found it to be an
<a href="https://notgull.net/why-you-want-async/">elegant model for programming</a> that
<a href="https://notgull.net/why-not-threads/">easily outclasses alternatives</a>. I use it
frequently in my own programs when it fits. There are a lot of programs that
would be improved with the presence of <code class="language-plaintext highlighter-rouge">async</code>, that don’t use it because
people are scared of it. In fact, many organizations have a “hard ban” on
<code class="language-plaintext highlighter-rouge">async</code> code.</p>

<p>Some of this criticism is valid. <code class="language-plaintext highlighter-rouge">async</code> code is a little hard to wrap your head around,
but that’s true with many other concepts in Rust, like the borrow checker and
the weekly blood sacrifices. Many popular <code class="language-plaintext highlighter-rouge">async</code> libraries are explicitly tied to heavyweight
crates like <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> and <a href="https://crates.io/crates/futures"><code class="language-plaintext highlighter-rouge">futures</code></a>, which aren’t good picks for many types of
programs. There are also a lot of language features that need to be released for
<code class="language-plaintext highlighter-rouge">async</code> to be used without an annoying amount of <code class="language-plaintext highlighter-rouge">Box</code>ing <code class="language-plaintext highlighter-rouge">dyn</code>amic objects.</p>

<p>There’s one point, though, that I’ve heard quite frequently. I think
it’s misleading. Let’s talk about it.</p>

<h2 id="whats-in-a-leak">What’s in a leak?</h2>

<p>I’ve seen a lot of people say that <code class="language-plaintext highlighter-rouge">async</code> is a “leaky abstraction”. What this means
is that the presence of <code class="language-plaintext highlighter-rouge">async</code> in a program forces you to bend the program’s
control flow to accommodate it. If you have 100 files in your program, and <em>one</em>
of those files uses <code class="language-plaintext highlighter-rouge">async</code>, you have to either write the <em>entire</em> program in
<code class="language-plaintext highlighter-rouge">async</code> or resort to bohemian, mind-bending hacks to contain it. Just like an
in-law moving into your spare bedroom.</p>

<p>I do not mean memory leaks, which is what happens if you fail to free memory that you
allocate. Neither <code class="language-plaintext highlighter-rouge">async</code> nor blocking code has a problem with memory leaking intrinsically.</p>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> If you want to see a good example of a leaky abstraction, consider <a href="https://developer.apple.com/documentation/appkit">AppKit</a>. Not only is AppKit thread-unsafe to the point where many functions can only safely be called on the <code class="language-plaintext highlighter-rouge">main()</code> thread, it forces you into Apple’s terrifying Objective-C model. Basically any program that wants to have working GUI on macOS now needs to interface in Apple’s way, with basically no alternatives.</p>
</blockquote>

<p>I’ve seen the <a href="https://journal.stuffwithstuff.com/2015/02/01/what-color-is-your-function/">“What Color is Your Function?”</a>
blogpost by Bob Nystrom referenced a lot in these discussions. This blogpost was
originally written with JavaScript’s callbacks in mind. Fair enough. The callback
model is hard to deal with, and its enduring popularity in the Rust ecosystem is something
I have to write a blogpost about. He also mentions <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> as a potential
solution to this problem, although one that he is unsatisfied with, as it still
divides the ecosystem into asynchronous and synchronous halves.</p>

<p>While this blogpost may be correct when it comes to JavaScript and other, higher-level
languages, I believe that Rust is different enough that its argument doesn’t hold here.
In fact, I believe the opposite is true. Non-<code class="language-plaintext highlighter-rouge">async</code> code (or “blocking” code)
is the real leaky abstraction.</p>

<h2 id="object-class-safe">Object Class: Safe</h2>

<p>I’d like to discuss how you call blocking code from <code class="language-plaintext highlighter-rouge">async</code> code, and vice versa.
That way we can compare.</p>

<p>Let’s make a table to describe how it goes calling functions from one “color” to
another. You can call blocking code from blocking code without any issues. You
can also call asynchronous code from asynchronous code trivially. There is also
a strategy for calling asynchronous code from blocking code that I will go into
shortly. So our table looks like this:</p>

<table>
  <thead>
    <tr>
      <th>→ calls ↓  code</th>
      <th><code class="language-plaintext highlighter-rouge">async</code></th>
      <th>blocking</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td><code class="language-plaintext highlighter-rouge">async</code></td>
      <td>Trivial</td>
      <td>Generally Easy</td>
    </tr>
    <tr>
      <td>blocking</td>
      <td><em>We’ll see…</em></td>
      <td>Trivial</td>
    </tr>
  </tbody>
</table>

<p>Note that not all code fits cleanly into the <code class="language-plaintext highlighter-rouge">async</code>/blocking categories. A notorious
example is GUI code, which uses blocking semantics but overall acts a lot like <code class="language-plaintext highlighter-rouge">async</code>
code in that it’s not allowed to block. But that’s a topic for another post.</p>

<p>When you write an <code class="language-plaintext highlighter-rouge">async</code> function, it returns a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>, which represents a
value that will eventually be resolved. There are a lot of things you can do with
a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>. You can race it against another <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>, spawn it on an executor,
and any number of other operations. It’s a point I delve deeper into in
<a href="https://notgull.net/why-you-want-async/">this post</a>.</p>

<p>However, one of the simpler operations is to just wait for a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> to
complete. Often, the waiting is done by blocking the current thread. So by
<a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html">“blocking on”</a> the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>, we can effectively turn an <code class="language-plaintext highlighter-rouge">async</code> function into
a synchronous call.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_code</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="k">fn</span> <span class="nf">my_main_blocking_code</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">use</span> <span class="nn">futures_lite</span><span class="p">::</span><span class="nn">future</span><span class="p">::</span><span class="n">block_on</span><span class="p">;</span>
    <span class="nf">block_on</span><span class="p">(</span><span class="nf">my_async_code</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p><a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> takes any <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>, whether it’s <code class="language-plaintext highlighter-rouge">!Send</code> or not <code class="language-plaintext highlighter-rouge">'static</code> or if
it’s about to explode. So literally any <code class="language-plaintext highlighter-rouge">async</code> function can be called from
synchronous code.</p>

<p>It’s relatively simple, too. <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> is implemented like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">pub</span> <span class="k">fn</span> <span class="n">block_on</span><span class="o">&lt;</span><span class="n">T</span><span class="o">&gt;</span><span class="p">(</span><span class="n">future</span><span class="p">:</span> <span class="k">impl</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="n">T</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">T</span><span class="p">{</span>
    <span class="c1">// A `Context` with a `Waker` is needed to poll a `Future`.</span>
    <span class="k">let</span> <span class="n">waker</span> <span class="o">=</span> <span class="nf">waker_that_blocks_current_thread</span><span class="p">();</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">context</span> <span class="o">=</span> <span class="nn">Context</span><span class="p">::</span><span class="nf">from_waker</span><span class="p">(</span><span class="o">&amp;</span><span class="n">waker</span><span class="p">);</span>

    <span class="k">let</span> <span class="k">mut</span> <span class="n">future</span> <span class="o">=</span> <span class="nn">std</span><span class="p">::</span><span class="nn">pin</span><span class="p">::</span><span class="nd">pin!</span><span class="p">(</span><span class="n">future</span><span class="p">);</span> <span class="c1">// This used to require `unsafe` code, but doesn't anymore!</span>

    <span class="c1">// Poll the future in a loop, blocking the thread while we wait.</span>
    <span class="k">loop</span> <span class="p">{</span>
        <span class="k">match</span> <span class="n">future</span><span class="nf">.as_mut</span><span class="p">()</span><span class="nf">.poll</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">context</span><span class="p">)</span> <span class="p">{</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(</span><span class="n">value</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="k">return</span> <span class="n">value</span><span class="p">,</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="n">Pending</span> <span class="k">=&gt;</span> <span class="nf">block_thread_until_waker_wakes_us</span><span class="p">(),</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> The actual <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> is a little more complicated. It has some logic to reuse the <code class="language-plaintext highlighter-rouge">waker</code> between function calls, to reduce the overhead to one thread-local key access and nothing else.</p>
</blockquote>
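
<p>If you want something you can actually compile, here is a minimal sketch of the same loop using only the standard library. It assumes a waker that simply unparks the blocked thread. The <code class="language-plaintext highlighter-rouge">ThreadWaker</code> name is mine, and this is not how <code class="language-plaintext highlighter-rouge">futures_lite</code> or <code class="language-plaintext highlighter-rouge">pollster</code> are actually written.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Hypothetical waker: waking it unparks the thread that is blocked in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Runs a future to completion on the current thread.
pub fn block_on<T>(future: impl Future<Output = T>) -> T {
    // A `Context` with a `Waker` is needed to poll a `Future`.
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut context = Context::from_waker(&waker);

    // Pin the future to the stack so it can be polled.
    let mut future = pin!(future);

    loop {
        match future.as_mut().poll(&mut context) {
            Poll::Ready(value) => return value,
            // `park` can return spuriously; in that case we just poll again.
            Poll::Pending => thread::park(),
        }
    }
}
```

<p>With this in scope, <code class="language-plaintext highlighter-rouge">block_on(async { 40 + 2 })</code> evaluates to <code class="language-plaintext highlighter-rouge">42</code> on the calling thread. No runtime, no extra threads.</p>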

<p>Okay, but what if you don’t want <a href="https://crates.io/crates/futures-lite"><code class="language-plaintext highlighter-rouge">futures_lite</code></a> in your dependency tree? <a href="https://crates.io/crates/futures-lite"><code class="language-plaintext highlighter-rouge">futures_lite</code></a>
isn’t the heaviest dependency on the block (that’s <a href="https://crates.io/crates/futures"><code class="language-plaintext highlighter-rouge">futures</code></a>), but it’s still a
non-negligible amount of code. No need to worry! There’s also <a href="https://crates.io/crates/pollster"><code class="language-plaintext highlighter-rouge">pollster</code></a>, which
has zero (required) dependencies and consists of fewer than 100 lines of code.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">my_main_blocking_code</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">use</span> <span class="nn">pollster</span><span class="p">::</span><span class="n">block_on</span><span class="p">;</span>
    <span class="nf">block_on</span><span class="p">(</span><span class="nf">my_async_code</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>So, calling <code class="language-plaintext highlighter-rouge">async</code> code from blocking code is easy. Just call <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a>. It’s that simple!</p>

<h2 id="its-not-that-simple">It’s not that simple</h2>

<p>Of course it’s not that simple. I’m sure people familiar with actually calling
<code class="language-plaintext highlighter-rouge">async</code> code from blocking code are screaming at the screen right now. So let’s
address that.</p>

<p>There are a substantial number of <code class="language-plaintext highlighter-rouge">async</code> crates out there that run on top of
<a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>. They use <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>’s primitives, <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>’s executor, and
<a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>’s I/O semantics. Because of this, they rely on <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>’s runtime to
be running in the background. If you try the above strategy for a crate that
relies on <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>, it will fail at runtime with a panic.</p>

<p>No need to fear. We can start a <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> runtime and let it
peacefully run in the background, forever. The libraries are able to pick up on
this runtime and use it.</p>

<p>In <code class="language-plaintext highlighter-rouge">main()</code>, during your program initialization, put this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">std</span><span class="p">::{</span><span class="n">future</span><span class="p">,</span> <span class="n">thread</span><span class="p">};</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// Create a runtime.</span>
    <span class="k">let</span> <span class="n">rt</span> <span class="o">=</span> <span class="nn">tokio</span><span class="p">::</span><span class="nn">runtime</span><span class="p">::</span><span class="nn">Builder</span><span class="p">::</span><span class="nf">new_current_thread</span><span class="p">()</span>
        <span class="nf">.enable_all</span><span class="p">()</span>
        <span class="nf">.build</span><span class="p">()</span>
        <span class="nf">.unwrap</span><span class="p">();</span>

    <span class="c1">// Grab a handle first, because the runtime itself moves to another thread.</span>
    <span class="k">let</span> <span class="n">handle</span> <span class="o">=</span> <span class="n">rt</span><span class="nf">.handle</span><span class="p">()</span><span class="nf">.clone</span><span class="p">();</span>

    <span class="c1">// Drive the runtime on that thread, forever. On a current-thread runtime,</span>
    <span class="c1">// only `Runtime::block_on` (not `Handle::block_on`) drives I/O and timers.</span>
    <span class="nn">thread</span><span class="p">::</span><span class="nf">spawn</span><span class="p">(</span><span class="k">move</span> <span class="p">||</span> <span class="n">rt</span><span class="nf">.block_on</span><span class="p">(</span><span class="nn">future</span><span class="p">::</span><span class="nn">pending</span><span class="p">::</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span><span class="p">()));</span>

    <span class="c1">// "Enter" the runtime and let it sit there.</span>
    <span class="k">let</span> <span class="n">_guard</span> <span class="o">=</span> <span class="n">handle</span><span class="nf">.enter</span><span class="p">();</span>

    <span class="c1">// Block on any futures.</span>
    <span class="nn">pollster</span><span class="p">::</span><span class="nf">block_on</span><span class="p">(</span><span class="nf">my_async_function</span><span class="p">());</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For any <code class="language-plaintext highlighter-rouge">block_on</code> calls in your application, the runtime will already be
available. Note that you will need to call <code class="language-plaintext highlighter-rouge">enter()</code> on any new threads that use
<a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> primitives. Thankfully you can get a <code class="language-plaintext highlighter-rouge">Handle</code> to the runtime, which
can be sent to any thread and is also cheaply clonable.</p>

<p>But that’s really it. Once you have the runtime humming away in the background,
<a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> futures should Just Work!</p>

<p>As an aside, another hitch is that <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> and functions like it are only available on <code class="language-plaintext highlighter-rouge">std</code>-enabled
platforms. But the <code class="language-plaintext highlighter-rouge">no_std</code> <code class="language-plaintext highlighter-rouge">async</code> story is a blogpost for another day.</p>

<h3 id="a-quick-segue-into-tokiomain">A Quick Segue into <code class="language-plaintext highlighter-rouge">tokio::main</code></h3>

<p>I’ve seen some people recommend using the <a href="https://docs.rs/tokio/latest/tokio/attr.main.html"><code class="language-plaintext highlighter-rouge">tokio::main</code></a> attribute to turn an
<code class="language-plaintext highlighter-rouge">async</code> function into a blocking function, then calling that from your real code.
For example:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[tokio::main(flavor</span> <span class="nd">=</span> <span class="s">"current_thread"</span><span class="nd">)]</span>
<span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_code</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// `tokio::main` transparently converts `my_async_code` into a blocking function.</span>
    <span class="nf">my_async_code</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>It’s a little impressive, if also a little hacky. The <code class="language-plaintext highlighter-rouge">async</code> function is turned into
a blocking function using the proc macro.</p>

<p>But… just don’t do this. It means that, every time <code class="language-plaintext highlighter-rouge">my_async_code</code> is called, it
spins up a <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> runtime, runs the code, then immediately throws that runtime
away. For functions that are called a lot, it really adds up. In addition, it makes
the function signature misleading. It’s a blocking function, not an <code class="language-plaintext highlighter-rouge">async</code> function!</p>

<h2 id="meanwhile-for-blocking-code">Meanwhile, for blocking code…</h2>

<p>First off, I find <code class="language-plaintext highlighter-rouge">async</code> code to be more predictable than blocking code, in a
weird way. Look at this function signature:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_function</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>
</code></pre></div></div>

<p>What does this tell you? I know that the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> returned by this function
won’t block. I can place it in my executor of choice, or race it against any
other <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s, without worrying that it will hog the execution loop. By
convention <code class="language-plaintext highlighter-rouge">poll()</code> will probably run in a time period close to “instant”,
before yielding and then letting something else take over.</p>

<p>Yes, there are buggy <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s out there. But well-formed <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s
return from <code class="language-plaintext highlighter-rouge">poll()</code> quickly.</p>
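
<p>To make “quickly” a bit more concrete, here is a small sketch of a hand-written future being polled by hand, using only the standard library. <code class="language-plaintext highlighter-rouge">YieldOnce</code>, <code class="language-plaintext highlighter-rouge">CountingWaker</code> and <code class="language-plaintext highlighter-rouge">drive_yield_once</code> are names I made up for illustration. Each call to <code class="language-plaintext highlighter-rouge">poll()</code> does a constant amount of work and then hands control back.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future: reports `Pending` once, then completes.
struct YieldOnce {
    yielded: bool,
}

impl Future for YieldOnce {
    type Output = &'static str;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        if self.yielded {
            return Poll::Ready("done");
        }
        self.yielded = true;
        // Ask to be polled again, then return right away instead of waiting here.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Hypothetical waker that only counts how often it was woken.
struct CountingWaker(AtomicUsize);

impl Wake for CountingWaker {
    fn wake(self: Arc<Self>) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

/// Polls `YieldOnce` by hand and returns (polls needed, wake-ups seen, output).
fn drive_yield_once() -> (usize, usize, &'static str) {
    let counter = Arc::new(CountingWaker(AtomicUsize::new(0)));
    let waker = Waker::from(counter.clone());
    let mut cx = Context::from_waker(&waker);
    let mut future = std::pin::pin!(YieldOnce { yielded: false });

    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return (polls, counter.0.load(Ordering::SeqCst), output);
        }
    }
}
```

<p>The future needs two polls and one wake-up to finish, and neither poll ever waits on anything. That is the convention an executor relies on.</p>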

<p>Now look at this function signature:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">my_function</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>
</code></pre></div></div>

<p>By looking at this function signature, can you tell how long it will take to run?
Maybe it will complete instantly. Maybe it reads from a file and can potentially
take anywhere from a few microseconds to a few <em>whole</em> seconds, depending on the file
system. Maybe it blocks on a network socket. Maybe it processes a bunch of data
in a loop, meaning that for large datasets it could run for a long time.</p>

<p>Yes, you can check the docs. But the docs usually fail to mention <em>any</em> of the
above behavior, even for functions in the standard library. All of this ignores
behavior dependent on generics/traits, too. It doesn’t matter how well-formed it is,
you can’t tell how this function will act.</p>

<p>Often, when writing <code class="language-plaintext highlighter-rouge">async</code> programs, I have to be extra sure when I use blocking
functions that I’m not accidentally blocking, which would lock up the entire event
loop. In most cases this requires me to read the entire code of the function
to understand what can go wrong.</p>

<p>If I can’t be sure it won’t block, I’ll need to wrap it in a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> that runs
it on its own isolated thread. <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> provides the <a href="https://crates.io/crates/blocking"><code class="language-plaintext highlighter-rouge">blocking</code></a> threadpool to
run code on other threads, while <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a> has a <a href="https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html"><code class="language-plaintext highlighter-rouge">spawn_blocking</code></a> function.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">blocking</span><span class="p">::</span><span class="n">unblock</span><span class="p">;</span>

<span class="k">fn</span> <span class="nf">my_blocking_function</span><span class="p">()</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_main</span><span class="p">()</span> <span class="p">{</span>
    <span class="nf">unblock</span><span class="p">(||</span> <span class="nf">my_blocking_function</span><span class="p">())</span><span class="k">.await</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>This method comes at a cost. At the very least it’s an allocation for the blocking
task’s state, as well as a few atomic operations to push it and then pop it from some thread
pool’s task queue. At worst it spawns an entire new thread. Compare this to the
cost of <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> which is usually one thread-local access.</p>
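
<p>To see where those costs come from, here is a rough sketch of the worst case (one new thread per call) using only the standard library. <code class="language-plaintext highlighter-rouge">unblock_sketch</code>, <code class="language-plaintext highlighter-rouge">Unblock</code> and <code class="language-plaintext highlighter-rouge">Shared</code> are hypothetical names. The real <code class="language-plaintext highlighter-rouge">blocking</code> crate uses a thread pool and is considerably smarter than this.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::{Arc, Mutex};
use std::task::{Context, Poll, Waker};
use std::thread;

/// State shared between the worker thread and the future (one heap allocation).
struct Shared<T> {
    result: Option<T>,
    waker: Option<Waker>,
}

/// Hypothetical stand-in for the future that `unblock` returns.
struct Unblock<T>(Arc<Mutex<Shared<T>>>);

impl<T> Future for Unblock<T> {
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let mut shared = self.0.lock().unwrap();
        match shared.result.take() {
            Some(value) => Poll::Ready(value),
            None => {
                // Not finished yet: remember who to wake, then return at once.
                shared.waker = Some(cx.waker().clone());
                Poll::Pending
            }
        }
    }
}

/// Runs `f` on a freshly spawned thread and returns a future for its result.
fn unblock_sketch<T, F>(f: F) -> Unblock<T>
where
    F: FnOnce() -> T + Send + 'static,
    T: Send + 'static,
{
    let shared = Arc::new(Mutex::new(Shared { result: None, waker: None }));
    let worker_side = shared.clone();

    thread::spawn(move || {
        // The blocking call happens here, off the async thread.
        let value = f();
        let mut shared = worker_side.lock().unwrap();
        shared.result = Some(value);
        if let Some(waker) = shared.waker.take() {
            waker.wake();
        }
    });

    Unblock(shared)
}
```

<p>The <code class="language-plaintext highlighter-rouge">Arc</code> is the allocation and the thread is the worst-case cost. A thread pool amortizes the second one, but not the first.</p>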

<p>But wait! <a href="https://docs.rs/blocking/latest/blocking/fn.unblock.html"><code class="language-plaintext highlighter-rouge">unblock</code></a> will send the function to another thread to be run. So,
the function needs to be <code class="language-plaintext highlighter-rouge">Send</code> and <code class="language-plaintext highlighter-rouge">'static</code>. This strategy doesn’t even work
if the function relies on some kind of thread-unsafe state, like a <a href="https://doc.rust-lang.org/std/cell/struct.RefCell.html"><code class="language-plaintext highlighter-rouge">RefCell</code></a>.
If the function takes a reference to some data you may need to wrap it in an
<a href="https://doc.rust-lang.org/std/sync/struct.Arc.html"><code class="language-plaintext highlighter-rouge">Arc</code></a>&lt;<a href="https://doc.rust-lang.org/std/sync/struct.Mutex.html"><code class="language-plaintext highlighter-rouge">Mutex</code></a>&gt;.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">blocking</span><span class="p">::</span><span class="n">unblock</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">sync</span><span class="p">::{</span><span class="nb">Arc</span><span class="p">,</span> <span class="n">Mutex</span><span class="p">};</span>

<span class="k">fn</span> <span class="nf">my_blocking_function</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Foo</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_main</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">data</span> <span class="o">=</span> <span class="nn">Arc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Mutex</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="cm">/* ... */</span><span class="p">));</span>
    <span class="nf">unblock</span><span class="p">({</span>
        <span class="k">let</span> <span class="n">data</span> <span class="o">=</span> <span class="n">data</span><span class="nf">.clone</span><span class="p">();</span>
        <span class="k">move</span> <span class="p">||</span> <span class="nf">my_blocking_function</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="nf">.lock</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">())</span>
    <span class="p">})</span><span class="k">.await</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>I know this is a common complaint with <a href="https://crates.io/crates/tokio"><code class="language-plaintext highlighter-rouge">tokio</code></a>’s style of <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>, but
it’s just as bad the other way as well.</p>

<p>For the record, you can call <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> with any kind of borrowed data, with no
hassles.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_code</span><span class="p">(</span><span class="n">foo</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Foo</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>

<span class="k">fn</span> <span class="nf">my_main_blocking_code</span><span class="p">()</span> <span class="p">{</span>
    <span class="k">use</span> <span class="nn">futures_lite</span><span class="p">::</span><span class="nn">future</span><span class="p">::</span><span class="n">block_on</span><span class="p">;</span>

    <span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="cm">/* ... */</span><span class="p">;</span>
    <span class="nf">block_on</span><span class="p">(</span><span class="nf">my_async_code</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">));</span> <span class="c1">// This works!</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In order to avoid these issues I often have to segment out code that might block
into its own sections. This lets me avoid the overhead of <a href="https://docs.rs/blocking/latest/blocking/fn.unblock.html"><code class="language-plaintext highlighter-rouge">unblock</code></a> for each
function as a bonus.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">some_blocking_segment</span><span class="p">(</span><span class="k">mut</span> <span class="n">data</span><span class="p">:</span> <span class="n">Foo</span><span class="p">)</span> <span class="p">{</span>
    <span class="nf">do_something</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">);</span>
    <span class="n">data</span><span class="nf">.postprocess</span><span class="p">();</span>
    <span class="nf">print_the_data</span><span class="p">(</span><span class="o">&amp;</span><span class="n">data</span><span class="p">);</span>
<span class="p">}</span>

<span class="k">async</span> <span class="k">fn</span> <span class="nf">my_async_main</span><span class="p">()</span> <span class="p">{</span>
    <span class="c1">// This doesn't work if `Foo` is `!Send`.</span>
    <span class="k">let</span> <span class="n">data</span> <span class="o">=</span> <span class="cm">/* ... */</span><span class="p">;</span>
    <span class="nf">unblock</span><span class="p">(</span><span class="k">move</span> <span class="p">||</span> <span class="nf">some_blocking_segment</span><span class="p">(</span><span class="n">data</span><span class="p">))</span><span class="k">.await</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>

<p>However this requires me to re-architect parts of my code into these segments. It gets
difficult to interweave further <code class="language-plaintext highlighter-rouge">async</code> code into this sub-section as well. Yes,
I can call <code class="language-plaintext highlighter-rouge">async</code> functions with <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a>, but I’d really prefer to <code class="language-plaintext highlighter-rouge">.await</code> them.</p>

<p>Say, doesn’t this seem very… <em>leaky</em>, to you?</p>

<h2 id="lets-fix-this">Let’s Fix This</h2>

<p>I don’t like to bring up a problem without also mentioning a possible solution.
I mentioned documentation above; it would be nice if there was some kind of
indicator that a function blocked.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cd">/// Does a thing.</span>
<span class="cd">/// </span>
<span class="cd">/// # Blocking</span>
<span class="cd">/// </span>
<span class="cd">/// This function will block the first time it is called, as it is reading from</span>
<span class="cd">/// `/dev/random` to seed the random number generator.</span>
<span class="k">fn</span> <span class="nf">my_blocking_function</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Foo</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>
</code></pre></div></div>

<p>It would be a Herculean effort, and I don’t think it’s a sustainable approach.
If you’re writing a higher level library, it would be a lot to ask to check if
your dependency’s dependency’s dependency maybe reads from a socket.</p>

<p>From a language standpoint, it would be nice if there was some kind of <code class="language-plaintext highlighter-rouge">#[blocking]</code>
attribute to indicate that a function blocked, like so:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[blocking]</span>
<span class="k">fn</span> <span class="nf">my_blocking_function</span><span class="p">(</span><span class="n">data</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Foo</span><span class="p">)</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">}</span>
</code></pre></div></div>

<p>Maybe there could even be some kind of tree-traversal to see if you were calling a
<code class="language-plaintext highlighter-rouge">#[blocking]</code> function from <code class="language-plaintext highlighter-rouge">async</code> code, and then raise a warning. Unfortunately I’m
unsure if this would work either. There are functions that might block once and never again,
or functions that only block under specific circumstances that the Rust compiler
can’t predict. Not to mention, it would be difficult to solve the problem of data
being processed in a tight loop.</p>

<p>So, I don’t know. There are some clever people on the language design team, so maybe
they have better ideas.</p>

<h2 id="parting-shots">Parting Shots</h2>

<p>Frankly, I don’t think <code class="language-plaintext highlighter-rouge">async</code> code is leaky at all, and the ways that it does leak
are largely due to library problems. Meanwhile blocking code leaks by its fundamental design.
I hope you found this helpful and that it might remove some reservations about using
<code class="language-plaintext highlighter-rouge">async</code> code in the future.</p>]]></content><author><name>John Nunley</name></author><category term="async" /><category term="rust" /><category term="smol" /><summary type="html"><![CDATA[Asynchronous code does not require the rest of your code to be asynchronous. I can’t say the same for blocking code.]]></summary></entry><entry><title type="html">Why am I writing a Rust compiler in C?</title><link href="http://notgull.net/announcing-dozer/" rel="alternate" type="text/html" title="Why am I writing a Rust compiler in C?" /><published>2024-08-25T00:00:00+00:00</published><updated>2024-08-25T00:00:00+00:00</updated><id>http://notgull.net/announcing-dozer</id><content type="html" xml:base="http://notgull.net/announcing-dozer/"><![CDATA[<p>To bootstrap Rust, no cost is too great.</p>

<p>Perceptive Rustaceans may have noticed my activity has gone down as of late.
There are a handful of different reasons for this. I’ve been the subject of a
truly apocalyptic series of life events, including the death of a relative that
rattled me to my core. I’ve had more responsibilities at work, leaving me with
less time and energy to contribute. Maybe I’ve also lost a little bit of the
college-kid enthusiasm that brought me to open source in the first place.</p>

<p>There’s another reason, too. I’ve been cooking up a project that’s been taking
up most of my time. It’s certainly the largest project I’ve created in the open
source world, and if I complete it, it will certainly be my crowning
achievement.</p>

<p>I am writing a Rust compiler in pure C. No C++. No <a href="https://en.wikipedia.org/wiki/Flex_(lexical_analyser_generator)"><code class="language-plaintext highlighter-rouge">flex</code></a> or <a href="https://en.wikipedia.org/wiki/Yacc"><code class="language-plaintext highlighter-rouge">yacc</code></a>. Not even a
<a href="https://en.wikipedia.org/wiki/Make_(software)"><code class="language-plaintext highlighter-rouge">Makefile</code></a>. Nothing but pure C.</p>

<p>It’s called <a href="https://codeberg.org/notgull/dozer">Dozer</a>.</p>

<h2 id="wait-why">Wait, Why?</h2>

<p>To understand why I’ve followed this path of madness, you first need to
understand bootstrapping and why it is important.</p>

<p>Let’s say that you’ve written some code in Rust. In order to run this code, you
need to <em>compile</em> it. A <em>compiler</em> is a program that parses your code, validates
its correctness, and then transforms it into machine code that the CPU can
understand.</p>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> Yes, it’s significantly more complicated than that. Except when it’s less complicated than that. Compilers are tricky to even describe.</p>
</blockquote>

<p>For Rust, your main compiler is <a href="https://github.com/rust-lang/rust">rustc</a>. If you don’t know, this is the
underlying program that <code class="language-plaintext highlighter-rouge">cargo</code> calls when you run <code class="language-plaintext highlighter-rouge">cargo build</code>. It’s fantastic
software, and frankly a gem of the open source community. Its code quality is up
there with the Linux kernel and the Quake III source code.</p>

<p>However, <a href="https://github.com/rust-lang/rust">rustc</a> itself is a program. So it needs a compiler to compile it from
its source code to machine code. Say, what language <em>is</em> <a href="https://github.com/rust-lang/rust">rustc</a> written in?</p>

<p><img src="/images/rustc-in-rust.png" alt="rustc is 97.3 percent rust" /></p>

<p>Ah, <a href="https://github.com/rust-lang/rust">rustc</a> is a Rust program. Written in Rust, for the purpose of compiling
Rust code. But, think about this for a second. If <a href="https://github.com/rust-lang/rust">rustc</a> is written in Rust,
and <a href="https://github.com/rust-lang/rust">rustc</a> is needed to compile Rust code, that means you need to use <a href="https://github.com/rust-lang/rust">rustc</a>
to compile <a href="https://github.com/rust-lang/rust">rustc</a>. Which is fine for us users, since we can just download
<a href="https://github.com/rust-lang/rust">rustc</a> from the internet and use it.</p>

<p>But, who compiled the first <a href="https://github.com/rust-lang/rust">rustc</a>? There had to be a chicken before the egg,
right? Where does it start?</p>

<p>…</p>

<p>Actually, that’s fairly simple. Every new version of <a href="https://github.com/rust-lang/rust">rustc</a> was compiled with
the previous version of <a href="https://github.com/rust-lang/rust">rustc</a>. So <a href="https://github.com/rust-lang/rust">rustc</a> version 1.80.0 was compiled with
<a href="https://github.com/rust-lang/rust">rustc</a> version 1.79.0. Which was, in turn, compiled with <a href="https://github.com/rust-lang/rust">rustc</a> version
1.78.0. And so on and so forth, all the way back to <a href="https://github.com/rust-lang/rust/tree/ef75860a0a72f79f97216f8aaa5b388d98da6480">version 0.7</a>
of the compiler. At that point, the compiler was written in <a href="https://en.wikipedia.org/wiki/OCaml">OCaml</a>. So all you
needed was an OCaml compiler to get a fully functioning <a href="https://github.com/rust-lang/rust">rustc</a> program.</p>

<p>There, problem solved! We’ve figured out how to create <a href="https://github.com/rust-lang/rust">rustc</a> from first
principles! All is well, let’s go back to business.</p>

<p>Just one more thing. We still need a version of the <a href="https://en.wikipedia.org/wiki/OCaml">OCaml</a> compiler for all of
this to work. So what language is the <a href="https://en.wikipedia.org/wiki/OCaml">OCaml</a> compiler written in?</p>

<p><img src="/images/ocaml-in-ocaml.png" alt="OCaml is 84 percent OCaml" /></p>

<p><em>faceplant</em></p>

<p>Okay, okay, no worries! There is a <a href="https://github.com/Ekdohibs/camlboot">project</a>
that can successfully compile the <a href="https://en.wikipedia.org/wiki/OCaml">OCaml</a> compiler using <a href="https://en.wikipedia.org/wiki/GNU_Guile">Guile</a>, which is one of the many
variants of <a href="https://en.wikipedia.org/wiki/Scheme_(programming_language)">Scheme</a>, which is one of many variants of <a href="https://en.wikipedia.org/wiki/Lisp_(programming_language)">Lisp</a>. Not to mention,
<a href="https://en.wikipedia.org/wiki/GNU_Guile">Guile</a>’s interpreter is written in C.</p>

<p>So this brings us, as all things eventually do, to the C programming language. We just
compile it using <a href="https://en.wikipedia.org/wiki/GNU_Compiler_Collection">GCC</a>, and everything works out. So we just need to compile
<a href="https://en.wikipedia.org/wiki/GNU_Compiler_Collection">GCC</a>, which is written using… C++?!</p>

<p>Okay, that’s a little unfair. <a href="https://en.wikipedia.org/wiki/GNU_Compiler_Collection">GCC</a> was written in C until version 4.8, and it’s
not like there’s a shortage of C compilers written in C out there. For instance,
consider <a href="https://en.wikipedia.org/wiki/Tiny_C_Compiler">TinyCC</a>, which is written in C and handles not only compiling, but
assembly and linking too.</p>

<p>…but that still doesn’t answer our question. What was the first C compiler
written in? Assembly? Then what was the first assembler written in?</p>

<h2 id="the-descent-principle">The Descent Principle</h2>

<p>This is where we introduce the <a href="https://bootstrappable.org/">Bootstrappable Builds</a>
project. To me, this is one of the most fascinating projects in the open source
community. It’s basically code alchemy.</p>

<p>Their <a href="https://github.com/fosslinux/live-bootstrap">Linux bootstrap process</a>
starts with a 512-byte binary seed. This seed contains what’s possibly the
simplest compiler you can imagine: it takes hexadecimal digits and outputs the
corresponding raw bytes. As an example, here is part of the “source code” that’s
compiled with this compiler:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>31 C0           # xor ax, ax
8E D8           # mov ds, ax
8E C0           # mov es, ax
8E D0           # mov ss, ax
BC 00 77        # mov sp, 0x7700
FC              # cld ; clear direction flag
88 16 15 7C     # mov [boot_drive], dl
</code></pre></div></div>

<p>Note that everything after the pound sign is a comment, and all whitespace is
stripped. Frankly, I’m not even sure this can be called a programming language.
Still, it is <em>technically</em> analyzable, dissectable source code.</p>
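
<p>To make the idea concrete, here’s a sketch of that hex-to-bytes step written in Rust. The real seed is raw machine code, so this is only an illustration of the behavior described above, not the project’s implementation. The function name <code class="language-plaintext highlighter-rouge">hex0</code> and the choice to silently skip non-hex characters are my own assumptions.</p>

```rust
/// Convert hex-digit "source code" into raw bytes.
///
/// Assumptions for this sketch: `#` starts a comment that runs to the end of
/// the line, whitespace is ignored, and any other non-hex character is skipped.
fn hex0(source: &str) -> Vec<u8> {
    let mut output = Vec::new();
    // Holds the first digit of a pair until the second one arrives.
    let mut high_nibble: Option<u8> = None;

    for line in source.lines() {
        // Drop everything after the first pound sign.
        let code = line.split('#').next().unwrap_or("");

        for ch in code.chars() {
            // `to_digit(16)` is `None` for whitespace and other non-hex characters.
            let digit = match ch.to_digit(16) {
                Some(digit) => digit as u8,
                None => continue,
            };

            match high_nibble.take() {
                // First digit of a pair: remember it.
                None => high_nibble = Some(digit),
                // Second digit: emit the finished byte.
                Some(high) => output.push((high << 4) | digit),
            }
        }
    }

    output
}
```

<p>Feeding it the first two lines of the listing above yields the four bytes <code class="language-plaintext highlighter-rouge">31 C0 8E D8</code>, with the comments thrown away.</p>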

<p>From here, this compiler compiles a very simple operating system, a barebones
shell, and a slightly more advanced compiler. That compiler compiles a slightly
more advanced compiler. A few steps later, you have something that roughly
<em>looks</em> like assembly code.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>DEFINE cmp_ebx,edx 39D3
DEFINE je 0F84
DEFINE sub_ebx, 81EB

:loop_options
    cmp_ebx,edx                         # Check if we are done
    je %loop_options_done               # We are done
    sub_ebx, %2                         # --options
</code></pre></div></div>

<p>Man, it’s weird to think of <em>assembly code</em> as being higher-level than anything
else, right?</p>
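
<p>The <code class="language-plaintext highlighter-rouge">DEFINE</code> lines are what make this listing readable: each one gives a name to a run of hex digits. Here’s a heavily simplified sketch of that expansion step in Rust. It’s not the bootstrap project’s tool; the function name <code class="language-plaintext highlighter-rouge">expand_defines</code> is hypothetical, and labels (<code class="language-plaintext highlighter-rouge">:name</code>), references (<code class="language-plaintext highlighter-rouge">%name</code>) and immediates are passed through untouched rather than resolved.</p>

```rust
use std::collections::HashMap;

/// Expand `DEFINE name hex` macros in a tiny assembly-like listing.
///
/// Simplification: only macro names are replaced. Labels, references and
/// immediates are emitted as-is, where a real tool would resolve them.
fn expand_defines(source: &str) -> Vec<String> {
    let mut defines: HashMap<String, String> = HashMap::new();
    let mut output = Vec::new();

    for line in source.lines() {
        // Strip the comment, then split what is left on whitespace.
        let code = line.split('#').next().unwrap_or("");
        let tokens: Vec<&str> = code.split_whitespace().collect();

        match tokens.as_slice() {
            // `DEFINE name value` records a macro and emits nothing.
            ["DEFINE", name, value] => {
                defines.insert(name.to_string(), value.to_string());
            }
            // Any other line: swap known names for their hex, keep the rest.
            other => {
                for token in other {
                    let expanded = defines
                        .get(*token)
                        .cloned()
                        .unwrap_or_else(|| token.to_string());
                    output.push(expanded);
                }
            }
        }
    }

    output
}
```

<p>Run over the first lines of the listing above, <code class="language-plaintext highlighter-rouge">cmp_ebx,edx</code> becomes <code class="language-plaintext highlighter-rouge">39D3</code> and <code class="language-plaintext highlighter-rouge">je</code> becomes <code class="language-plaintext highlighter-rouge">0F84</code>, which is exactly the kind of input the hex compiler from earlier can digest.</p>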

<p>This is enough to get them to a very basic subset of C. Then they compile a
slightly more advanced C compiler written in this subset. A few steps later they
can compile <a href="https://en.wikipedia.org/wiki/Tiny_C_Compiler">TinyCC</a>. From there they can bootstrap <a href="https://en.wikipedia.org/wiki/Yacc"><code class="language-plaintext highlighter-rouge">yacc</code></a>, basic coreutils,
Bash, autotools, and eventually <a href="https://en.wikipedia.org/wiki/GNU_Compiler_Collection">GCC</a> and Linux.</p>

<p>I’m not doing this justice; it’s a fascinating process. Every step is listed
<a href="https://github.com/fosslinux/live-bootstrap/blob/master/parts.rst">here</a>.</p>

<p>Anyhow, you’ve essentially gone from “a binary blob small enough to be manually
analyzed” to Linux, GCC, and basically everything else. But let’s start again
from <a href="https://en.wikipedia.org/wiki/Tiny_C_Compiler">TinyCC</a>.</p>

<p>Right now, Rust shows up very late in this process. They use <a href="https://github.com/thepowersgang/mrustc">mrustc</a>, an
alternative Rust implementation written in C++ that can compile <a href="https://github.com/rust-lang/rust">rustc</a> version
1.56. From here, they then compile up to modern Rust code.</p>

<p>The main issue here is that, by the time C++ is introduced into the bootstrap
chain, the bootstrap is basically over. So if you wanted to use Rust at any
point before C++ is introduced, you’re out of luck.</p>

<p>So, for me, it would be <em>really nice</em> if there was a Rust compiler that could be
bootstrapped from C. Specifically, a Rust compiler that can be bootstrapped from
<a href="https://en.wikipedia.org/wiki/Tiny_C_Compiler">TinyCC</a>, while assuming that there are no tools on the system yet that could be
potentially useful.</p>

<p>That’s <a href="https://codeberg.org/notgull/dozer">Dozer</a>.</p>

<h2 id="the-plan">The Plan</h2>

<p>I’ve been working on <a href="https://codeberg.org/notgull/dozer">Dozer</a> for the past two months, putting my anemic free
time to work on writing in a language that I kind of hate.</p>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> That’s a little unfair. C has some elegant qualities to it. Reality truly is what you make of it. It’s just that I would not let this code anywhere near production.</p>
</blockquote>

<p>It’s written with no extensions, and so far both <a href="https://en.wikipedia.org/wiki/Tiny_C_Compiler">TinyCC</a> and <a href="https://sr.ht/~mcf/cproc/">cproc</a> are able
to compile it with no issues. I’m using <a href="https://c9x.me/compile/">QBE</a> as a backend. Other than that, I
assume no tools exist on the system. Just a C compiler, some very basic shell
implementation, and nothing else.</p>

<p>I won’t get into the raw <em>experience</em> of writing a compiler in this blogpost.
But so far, I have the lexer done, as well as a sizable part of the parser.
Macro/module expansion is something I’m putting off as long as possible,
typechecking only supports <code class="language-plaintext highlighter-rouge">i32</code>, and codegen is a little bit rough. But it’s a
start.</p>

<p>I can successfully compile this code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">rust_main</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="nb">i32</span> <span class="p">{</span>
    <span class="p">(</span><span class="mi">2</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="o">*</span> <span class="mi">6</span> <span class="o">+</span> <span class="mi">3</span>
<span class="p">}</span>
</code></pre></div></div>
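
<p>To give a feel for how much is hiding behind that one-line function body, here’s a sketch of a lexer and a precedence-aware parser for the same kind of expression. This is not Dozer’s code: Dozer is written in C and emits QBE, while this is a toy in Rust that evaluates the expression directly instead of generating code. Every name in it is hypothetical.</p>

```rust
/// Tokens for a tiny arithmetic language.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Token {
    Int(i32),
    Plus,
    Minus,
    Star,
    LParen,
    RParen,
}

/// Lexer: turn source text into tokens, skipping whitespace.
fn lex(source: &str) -> Vec<Token> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&ch) = chars.peek() {
        if ch.is_ascii_digit() {
            // Accumulate a multi-digit integer literal.
            let mut value = 0i32;
            while let Some(digit) = chars.peek().and_then(|c| c.to_digit(10)) {
                value = value * 10 + digit as i32;
                chars.next();
            }
            tokens.push(Token::Int(value));
        } else {
            match ch {
                '+' => tokens.push(Token::Plus),
                '-' => tokens.push(Token::Minus),
                '*' => tokens.push(Token::Star),
                '(' => tokens.push(Token::LParen),
                ')' => tokens.push(Token::RParen),
                c if c.is_whitespace() => {}
                other => panic!("unexpected character {:?}", other),
            }
            chars.next();
        }
    }

    tokens
}

/// Recursive-descent parser that evaluates as it goes.
struct Parser {
    tokens: Vec<Token>,
    pos: usize,
}

impl Parser {
    fn peek(&self) -> Option<Token> {
        self.tokens.get(self.pos).copied()
    }

    fn next(&mut self) -> Option<Token> {
        let token = self.peek();
        self.pos += 1;
        token
    }

    // expr := term (('+' | '-') term)*
    fn expr(&mut self) -> i32 {
        let mut value = self.term();
        loop {
            match self.peek() {
                Some(Token::Plus) => {
                    self.next();
                    value += self.term();
                }
                Some(Token::Minus) => {
                    self.next();
                    value -= self.term();
                }
                _ => return value,
            }
        }
    }

    // term := atom ('*' atom)*
    fn term(&mut self) -> i32 {
        let mut value = self.atom();
        while self.peek() == Some(Token::Star) {
            self.next();
            value *= self.atom();
        }
        value
    }

    // atom := INT | '(' expr ')'
    fn atom(&mut self) -> i32 {
        match self.next() {
            Some(Token::Int(n)) => n,
            Some(Token::LParen) => {
                let value = self.expr();
                assert_eq!(self.next(), Some(Token::RParen), "expected `)`");
                value
            }
            other => panic!("unexpected token {:?}", other),
        }
    }
}

/// Lex and evaluate an expression in one go.
fn eval(source: &str) -> i32 {
    let mut parser = Parser { tokens: lex(source), pos: 0 };
    parser.expr()
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">eval("(2 - 1) * 6 + 3")</code> gives <code class="language-plaintext highlighter-rouge">9</code>. A real compiler would build a syntax tree at each of these steps and hand it to later stages, rather than computing the answer on the spot.</p>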

<p>So, where to from here? Here’s my plan.</p>

<ul>
  <li>Slowly advance <a href="https://codeberg.org/notgull/dozer">Dozer</a> until it can compile some basic <code class="language-plaintext highlighter-rouge">libc</code>-using samples,
then <code class="language-plaintext highlighter-rouge">libcore</code>, then <a href="https://github.com/rust-lang/rust">rustc</a>.
    <ul>
      <li>For the record, I’m planning on compiling <a href="https://github.com/rust-lang/rust">rustc</a>’s <a href="https://github.com/rust-lang/rustc_codegen_cranelift">Cranelift</a>
backend, which is written entirely in Rust. Since we’re assuming we don’t
have C++ yet, we can’t compile LLVM.</li>
    </ul>
  </li>
  <li>Create a <code class="language-plaintext highlighter-rouge">cargo</code> equivalent that can use <a href="https://codeberg.org/notgull/dozer">Dozer</a> to compile Rust packages.</li>
  <li>Find out which sources in <a href="https://github.com/rust-lang/rust">rustc</a> are automatically generated and then strip
them out. By the Bootstrappable project’s rules, automatically generated code
is not allowed.</li>
  <li>Create a process that can be used to compile <a href="https://github.com/rust-lang/rust">rustc</a> and then <code class="language-plaintext highlighter-rouge">cargo</code>, then
use our compiled versions of <a href="https://github.com/rust-lang/rust">rustc</a>/<code class="language-plaintext highlighter-rouge">cargo</code> to re-compile canonical versions
of <a href="https://github.com/rust-lang/rust">rustc</a>/<code class="language-plaintext highlighter-rouge">cargo</code>.</li>
</ul>

<p>This will definitely be the hardest project I’ve ever undertaken. Part of me
doubts that I will be able to finish it. But you know what? It’s better to have
tried and lost than to never have tried at all.</p>

<p>Stay tuned for more <a href="https://codeberg.org/notgull/dozer">Dozer</a> updates, as well as an explanation of the
architecture I have planned.</p>]]></content><author><name>John Nunley</name></author><category term="c" /><category term="dozer" /><category term="rust" /><summary type="html"><![CDATA[To bootstrap Rust, no cost is too great.]]></summary></entry><entry><title type="html">I appeared on a podcast!</title><link href="http://notgull.net/qna-with-friends/" rel="alternate" type="text/html" title="I appeared on a podcast!" /><published>2024-08-13T00:00:00+00:00</published><updated>2024-08-13T00:00:00+00:00</updated><id>http://notgull.net/qna-with-friends</id><content type="html" xml:base="http://notgull.net/qna-with-friends/"><![CDATA[<p>I recently appeared on the “QnA With Friends” podcast, run by <a href="https://github.com/mamaicode">Irine</a>.</p>

<p>If you want to hear me talk about <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>, <a href="https://crates.io/crates/winit"><code class="language-plaintext highlighter-rouge">winit</code></a>, open source in general, and how I think
people should learn <code class="language-plaintext highlighter-rouge">async</code> Rust, consider giving it a watch!</p>

<iframe width="560" height="315" src="https://www.youtube.com/embed/EnWbnJXkOsg?si=MF4p4wd_8ZoN7xx1" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen=""></iframe>]]></content><author><name>John Nunley</name></author><category term="async" /><category term="rust" /><summary type="html"><![CDATA[I recently appeared on the “QnA With Friends” podcast, run by Irine.]]></summary></entry><entry><title type="html">Explaining the internals of async-task from the ground up</title><link href="http://notgull.net/async-task-explained-part1/" rel="alternate" type="text/html" title="Explaining the internals of async-task from the ground up" /><published>2024-03-30T00:00:00+00:00</published><updated>2024-03-30T00:00:00+00:00</updated><id>http://notgull.net/async-task-explained-part1</id><content type="html" xml:base="http://notgull.net/async-task-explained-part1/"><![CDATA[<p><a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a> is one of the most complicated crates in the <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> ecosystem. But, fundamentally, it’s just a future on the heap.</p>

<p>I pride myself on <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> packages being very easy to parse for anyone with a beginner’s level of experience in Rust. By that I mean, if you want to know how <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> works, it should be very easy to pick up the source code, read through it, and understand how each individual part works.</p>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> Wait, do people normally read source code for fun?</p>
</blockquote>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> No, I think that’s just a “me” thing.</p>
</blockquote>

<p>There are a few crates that are a little harder to take as bathroom reading. There’s <a href="https://crates.io/crates/polling"><code class="language-plaintext highlighter-rouge">polling</code></a>, which does a lot of low-level system interaction to make asynchronous I/O work. I’ve done my best to make it interesting, but there’s not a whole lot to say about a crate that’s basically following the OS’s instruction manual.</p>

<p>Then there’s <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a>. <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a>’s philosophy runs counter to the rest of <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>. When it comes to optimization, <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a> generally tries to go for safety and reasonability over crazy optimizations with diminishing returns. For <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a> however, we take the gloves off. We go all out to make sure tasks are as small and use as few resources as possible.</p>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> This is actually because <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a> predates <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>! It was originally used as the task implementation for <a href="https://crates.io/crates/async-std"><code class="language-plaintext highlighter-rouge">async-std</code></a>.</p>
</blockquote>

<p>I’d like to provide this series of blogposts as a reference for how <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a> works, how you might arrive at an implementation like <a href="https://crates.io/crates/async-task"><code class="language-plaintext highlighter-rouge">async-task</code></a> organically, and how it was optimized into its current state.</p>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> As a heads up: most posts for this blog assume an intermediate knowledge of Rust. However, this post is intended for readers who may not already be familiar with concepts like executors or dynamic dispatch.</p>

  <p>Of course, it may be a good idea to review the basics, even if you’re an expert.</p>
</blockquote>

<h2 id="background-basics">Background Basics</h2>

<p>Let’s say you have two <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s: blocks of asynchronous code that can be run concurrently. You want to run both of them at once.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Future #1</span>
<span class="k">let</span> <span class="n">foo</span> <span class="o">=</span> <span class="k">async</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">x</span> <span class="o">=</span> <span class="nf">my_function</span><span class="p">()</span><span class="k">.await</span><span class="p">;</span>
    <span class="nf">do_something</span><span class="p">(</span><span class="n">x</span><span class="p">)</span><span class="k">.await</span><span class="p">;</span>
<span class="p">};</span>

<span class="c1">// Future #2</span>
<span class="k">let</span> <span class="n">bar</span> <span class="o">=</span> <span class="k">async</span> <span class="p">{</span>
    <span class="k">for</span> <span class="n">_</span> <span class="k">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">50</span> <span class="p">{</span>
        <span class="nf">respond_to_user</span><span class="p">()</span><span class="k">.await</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">};</span>
</code></pre></div></div>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> Wait… <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>? <code class="language-plaintext highlighter-rouge">async</code>? <code class="language-plaintext highlighter-rouge">await</code>? What’s that?</p>
</blockquote>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> They’re Rust’s user-space concurrency building blocks! If you need a refresher on what these mean, it may be worth it to read the <a href="https://rust-lang.github.io/async-book/">async book</a>.</p>
</blockquote>

<p>Running two futures at a time can be done very easily. First, we bring in the <a href="https://crates.io/crates/futures-lite"><code class="language-plaintext highlighter-rouge">futures-lite</code></a> crate:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo add futures-lite
    Updating crates.io index
      Adding futures-lite v2.3.0 to dependencies
             Features:
             + alloc
             + fastrand
             + futures-io
             + parking
             + race
             + std
             - memchr
    Updating crates.io index
</code></pre></div></div>

<p>Then, we can use the <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> combinator to run both <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s at the same time.
Finally, we can use <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.block_on.html"><code class="language-plaintext highlighter-rouge">block_on</code></a> to poll the resulting <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> until it
completes. It looks like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">futures_lite</span><span class="p">::</span><span class="n">future</span><span class="p">;</span>

<span class="c1">// Run the two futures concurrently.</span>
<span class="k">let</span> <span class="n">combined</span> <span class="o">=</span> <span class="nn">future</span><span class="p">::</span><span class="nf">zip</span><span class="p">(</span><span class="n">foo</span><span class="p">,</span> <span class="n">bar</span><span class="p">);</span>

<span class="c1">// Block on the combined future until it completes.</span>
<span class="nn">future</span><span class="p">::</span><span class="nf">block_on</span><span class="p">(</span><span class="n">combined</span><span class="p">);</span>
</code></pre></div></div>

<p>How the <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> combinator works is as follows:</p>

<ul>
  <li>It tries to poll the first <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>. If it is ready, it takes the result and
saves it in memory. It remembers not to poll the first <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> again.</li>
  <li>It does the same thing for the second <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>. It polls it if it hasn’t
finished. If it has, it saves the result.</li>
  <li>Once both <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s are finished, it returns a tuple of the results.</li>
</ul>
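
<p>The steps above can be sketched by hand using only the standard library. To be clear, this is not the <a href="https://crates.io/crates/futures-lite"><code class="language-plaintext highlighter-rouge">futures-lite</code></a> source. It boxes both futures to avoid <code class="language-plaintext highlighter-rouge">unsafe</code>, and the tiny <code class="language-plaintext highlighter-rouge">block_on</code> and <code class="language-plaintext highlighter-rouge">YieldOnce</code> helpers are stand-ins I’ve made up so the example can run on its own.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// A hand-rolled `zip`-style combinator (illustrative, not the real one).
struct Zip<A, B> {
    first: Pin<Box<dyn Future<Output = A>>>,
    second: Pin<Box<dyn Future<Output = B>>>,
    // A stored result doubles as the "don't poll this again" marker.
    first_result: Option<A>,
    second_result: Option<B>,
}

fn zip<A, B>(
    first: impl Future<Output = A> + 'static,
    second: impl Future<Output = B> + 'static,
) -> Zip<A, B> {
    Zip {
        first: Box::pin(first),
        second: Box::pin(second),
        first_result: None,
        second_result: None,
    }
}

impl<A: Unpin, B: Unpin> Future for Zip<A, B> {
    type Output = (A, B);

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<(A, B)> {
        let this = &mut *self;

        // Poll the first future only if it hasn't finished yet.
        if this.first_result.is_none() {
            if let Poll::Ready(value) = this.first.as_mut().poll(cx) {
                this.first_result = Some(value);
            }
        }

        // Same thing for the second future.
        if this.second_result.is_none() {
            if let Poll::Ready(value) = this.second.as_mut().poll(cx) {
                this.second_result = Some(value);
            }
        }

        // Once both are finished, hand back a tuple of the results.
        if this.first_result.is_some() && this.second_result.is_some() {
            let first = this.first_result.take().unwrap();
            let second = this.second_result.take().unwrap();
            Poll::Ready((first, second))
        } else {
            Poll::Pending
        }
    }
}

/// Helper future: `Pending` on the first poll, `Ready` on the second.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            Poll::Ready(())
        } else {
            self.0 = true;
            // Ask to be polled again right away.
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

/// Waker that unparks the thread sitting in `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-threaded `block_on`: poll, park until woken, repeat.
fn block_on<T>(future: impl Future<Output = T>) -> T {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);

    loop {
        match future.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}
```

<p>With this, <code class="language-plaintext highlighter-rouge">block_on(zip(YieldOnce(false), std::future::ready("two")))</code> takes two polls: the second future finishes on the first pass and is never polled again, while the first one finishes on the second pass.</p>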

<p>Using this strategy, we can poll two <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s at the same time. The
following diagram shows what this looks like in practice:</p>

<p><img src="/images/async-task-zip.svg" alt="zip diagram" /></p>

<p>Note that, even though only one thread of execution is used, it appears as
though the futures are run at the same time.</p>

<h2 id="scalability-solutions">Scalability Solutions</h2>

<p>The <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> combinator works for very simple cases of concurrency, but falls
apart for higher-level scenarios. Let’s say you want to poll four futures at once. (The horror!)</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">baz</span> <span class="o">=</span> <span class="k">async</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span>
<span class="k">let</span> <span class="n">cap</span> <span class="o">=</span> <span class="k">async</span> <span class="p">{</span> <span class="cm">/* ... */</span> <span class="p">};</span>
</code></pre></div></div>

<p>Then, you would need to call <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> three times!</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="n">combined</span> <span class="o">=</span> <span class="nn">future</span><span class="p">::</span><span class="nf">zip</span><span class="p">(</span>
    <span class="nn">future</span><span class="p">::</span><span class="nf">zip</span><span class="p">(</span><span class="n">foo</span><span class="p">,</span> <span class="n">bar</span><span class="p">),</span>
    <span class="nn">future</span><span class="p">::</span><span class="nf">zip</span><span class="p">(</span><span class="n">baz</span><span class="p">,</span> <span class="n">cap</span><span class="p">)</span>
<span class="p">);</span>
</code></pre></div></div>

<p>You run into some problems too, like:</p>

<ul>
  <li>You can only run a fixed number of futures at once. If you might run a variable number of futures, you’re out of luck.</li>
  <li>What if you want to cancel one of the futures halfway through?</li>
  <li>Each future is polled every time <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> is woken up. This means polling <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a> is an <code class="language-plaintext highlighter-rouge">O(n)</code> operation. This is sometimes known as the “thundering herd” problem.</li>
</ul>

<p>Let’s try to solve these problems. Without any prior art, I mean. We can solve
the “fixed number of futures” problem pretty easily. Consider the <a href="https://crates.io/crates/slab"><code class="language-plaintext highlighter-rouge">slab</code></a> crate,
which lets us set up an indexed list of objects. It’s similar to an arena. We can fit our futures in there.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo add slab
    Updating crates.io index
      Adding slab v0.4.9 to dependencies
             Features:
             + std
             - serde
    Updating crates.io index
</code></pre></div></div>
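
<p>If you haven’t used a slab before, here’s a minimal sketch of the idea using nothing but a <code class="language-plaintext highlighter-rouge">Vec</code>. The <code class="language-plaintext highlighter-rouge">MiniSlab</code> type is hypothetical and much simpler than the real crate, which is organized differently internally. It just shows the core trick: values live at stable indices, and freed indices get reused.</p>

```rust
/// A minimal slab-like container, for illustration only.
struct MiniSlab<T> {
    /// `Some` for occupied slots, `None` for vacant ones.
    slots: Vec<Option<T>>,
    /// Indices of vacant slots that can be handed out again.
    free: Vec<usize>,
}

impl<T> MiniSlab<T> {
    fn new() -> Self {
        Self {
            slots: Vec::new(),
            free: Vec::new(),
        }
    }

    /// Store a value and return the key it can be looked up with.
    fn insert(&mut self, value: T) -> usize {
        match self.free.pop() {
            // Reuse a vacant slot if there is one.
            Some(key) => {
                self.slots[key] = Some(value);
                key
            }
            // Otherwise grow the list by one slot.
            None => {
                self.slots.push(Some(value));
                self.slots.len() - 1
            }
        }
    }

    /// Look up a value by key.
    fn get(&self, key: usize) -> Option<&T> {
        self.slots.get(key).and_then(|slot| slot.as_ref())
    }

    /// Remove a value, leaving its slot free for a later `insert`.
    fn remove(&mut self, key: usize) -> Option<T> {
        let value = self.slots.get_mut(key)?.take();
        if value.is_some() {
            self.free.push(key);
        }
        value
    }
}
```

<p>The important property is that removing one entry never shifts the others, so a key handed out by <code class="language-plaintext highlighter-rouge">insert</code> stays valid until that specific entry is removed.</p>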

<p>Let’s also box the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>s, so we can use multiple different implementors of <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> in the same <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a>.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">slab</span><span class="p">::</span><span class="n">Slab</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">future</span><span class="p">::</span><span class="n">Future</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">pin</span><span class="p">::</span><span class="nb">Pin</span><span class="p">;</span>

<span class="k">struct</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="c1">// Completed futures are represented by `None`.</span>
    <span class="n">futures</span><span class="p">:</span> <span class="n">Slab</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="p">()</span><span class="o">&gt;&gt;&gt;&gt;&gt;</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span>
        <span class="k">Self</span> <span class="p">{</span>
            <span class="n">futures</span><span class="p">:</span> <span class="nn">Slab</span><span class="p">::</span><span class="nf">new</span><span class="p">()</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Let’s have an <code class="language-plaintext highlighter-rouge">insert</code> method that can be used to add a new <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> to this
new <a href="https://docs.rs/futures-lite/latest/futures_lite/future/fn.zip.html"><code class="language-plaintext highlighter-rouge">zip</code></a>-equivalent. It will return a key that can be used to look up the
future in our list.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="n">insert</span><span class="o">&lt;</span><span class="n">F</span><span class="p">:</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="p">()</span><span class="o">&gt;</span> <span class="o">+</span> <span class="k">'static</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">future</span><span class="p">:</span> <span class="n">F</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">usize</span> <span class="p">{</span>
        <span class="k">self</span><span class="py">.futures</span><span class="nf">.insert</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="nn">Box</span><span class="p">::</span><span class="nf">pin</span><span class="p">(</span><span class="n">future</span><span class="p">)))</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> The <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> needs to be <code class="language-plaintext highlighter-rouge">'static</code> because a boxed <code class="language-plaintext highlighter-rouge">dyn Future</code> with no explicit lifetime bound defaults to <code class="language-plaintext highlighter-rouge">'static</code>. It’s possible to work around this by adding a lifetime to <code class="language-plaintext highlighter-rouge">GiantZip</code> here, but let’s keep it simple for now.</p>
</blockquote>
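
<p>For the curious, the lifetime workaround could look something like the sketch below. It is not the version used in the rest of this post: <code class="language-plaintext highlighter-rouge">BorrowingZip</code>, <code class="language-plaintext highlighter-rouge">poll_all</code>, <code class="language-plaintext highlighter-rouge">NoopWaker</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical names, and a plain <code class="language-plaintext highlighter-rouge">Vec</code> stands in for the slab so that the example only needs the standard library.</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Wake, Waker};

/// Hypothetical variant of `GiantZip` whose futures may borrow data living for `'a`.
struct BorrowingZip<'a> {
    futures: Vec<Option<Pin<Box<dyn Future<Output = ()> + 'a>>>>,
}

impl<'a> BorrowingZip<'a> {
    fn new() -> Self {
        Self { futures: Vec::new() }
    }

    // The bound is `'a` instead of `'static`.
    fn insert<F: Future<Output = ()> + 'a>(&mut self, future: F) -> usize {
        self.futures.push(Some(Box::pin(future)));
        self.futures.len() - 1
    }

    /// Poll every unfinished future once; returns `true` when all are done.
    fn poll_all(&mut self, cx: &mut Context<'_>) -> bool {
        for slot in self.futures.iter_mut() {
            if let Some(future) = slot.as_mut() {
                if future.as_mut().poll(cx).is_ready() {
                    *slot = None;
                }
            }
        }
        self.futures.iter().all(|slot| slot.is_none())
    }
}

/// A waker that does nothing, just so we have a `Context` to poll with.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn demo() -> u32 {
    // Local state that the futures borrow rather than own.
    let counter = Cell::new(0);

    let mut zip = BorrowingZip::new();
    zip.insert(async { counter.set(counter.get() + 1) });
    zip.insert(async { counter.set(counter.get() + 10) });

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    // Neither future ever waits, so one pass finishes both.
    assert!(zip.poll_all(&mut cx));

    counter.get()
}
```

<p>The cost is that the lifetime parameter leaks into every type that holds the zip, which is why the <code class="language-plaintext highlighter-rouge">'static</code> version is easier to follow.</p>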

<p>Finally, let’s make it so <code class="language-plaintext highlighter-rouge">poll</code>ing the <code class="language-plaintext highlighter-rouge">GiantZip</code> tries to resolve every one
of the futures contained within.</p>

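<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">task</span><span class="p">::{</span><span class="n">Context</span><span class="p">,</span> <span class="n">Poll</span><span class="p">};</span>

<span class="k">impl</span> <span class="n">Future</span> <span class="k">for</span> <span class="n">GiantZip</span> <span class="p">{</span>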
    <span class="k">type</span> <span class="n">Output</span> <span class="o">=</span> <span class="p">();</span>

    <span class="k">fn</span> <span class="nf">poll</span><span class="p">(</span><span class="k">mut</span> <span class="k">self</span><span class="p">:</span> <span class="nb">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span> <span class="k">Self</span><span class="o">&gt;</span><span class="p">,</span> <span class="n">cx</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Context</span><span class="o">&lt;</span><span class="nv">'_</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Poll</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="k">let</span> <span class="k">mut</span> <span class="n">unfinished</span> <span class="o">=</span> <span class="k">false</span><span class="p">;</span>

        <span class="k">for</span> <span class="p">(</span><span class="n">_</span><span class="p">,</span> <span class="n">future_slot</span><span class="p">)</span> <span class="k">in</span> <span class="k">self</span><span class="py">.futures</span><span class="nf">.iter_mut</span><span class="p">()</span> <span class="p">{</span>
            <span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">future</span><span class="p">)</span> <span class="o">=</span> <span class="n">future_slot</span><span class="nf">.as_mut</span><span class="p">()</span> <span class="p">{</span>
                <span class="c1">// Try to poll this future.</span>
                <span class="k">match</span> <span class="n">future</span><span class="nf">.as_mut</span><span class="p">()</span><span class="nf">.poll</span><span class="p">(</span><span class="n">cx</span><span class="p">)</span> <span class="p">{</span>
                    <span class="nn">Poll</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(())</span> <span class="k">=&gt;</span> <span class="p">{</span>
                        <span class="c1">// Set the future to `None`.</span>
                        <span class="o">*</span><span class="n">future_slot</span> <span class="o">=</span> <span class="nb">None</span><span class="p">;</span>
                    <span class="p">},</span>

                    <span class="nn">Poll</span><span class="p">::</span><span class="n">Pending</span> <span class="k">=&gt;</span> <span class="p">{</span>
                        <span class="c1">// At least one future is unfinished, so we return Pending below.</span>
                        <span class="n">unfinished</span> <span class="o">=</span> <span class="k">true</span><span class="p">;</span>
                    <span class="p">}</span>
                <span class="p">}</span>
            <span class="p">}</span>
        <span class="p">}</span>

        <span class="k">if</span> <span class="n">unfinished</span> <span class="p">{</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="n">Pending</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(())</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now we can test this out on futures that actually do something.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo add async-channel
    Updating crates.io index
      Adding async-channel v2.2.0 to dependencies
             Features:
             + std
    Updating crates.io index
</code></pre></div></div>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Create a channel with a capacity of 1.</span>
<span class="k">let</span> <span class="p">(</span><span class="n">sender</span><span class="p">,</span> <span class="n">recv</span><span class="p">)</span> <span class="o">=</span> <span class="nn">async_channel</span><span class="p">::</span><span class="nf">bounded</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>

<span class="c1">// This is basically an `async fn` that sends a number over the channel.</span>
<span class="k">let</span> <span class="n">our_future</span> <span class="o">=</span> <span class="p">|</span><span class="n">i</span><span class="p">:</span> <span class="nb">i32</span><span class="p">|</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">sender</span> <span class="o">=</span> <span class="n">sender</span><span class="nf">.clone</span><span class="p">();</span>
    <span class="k">async</span> <span class="k">move</span> <span class="p">{</span> <span class="n">sender</span><span class="nf">.send</span><span class="p">(</span><span class="n">i</span><span class="p">)</span><span class="k">.await</span><span class="nf">.ok</span><span class="p">();</span> <span class="p">}</span>
<span class="p">};</span>

<span class="c1">// Create a future that reads from the channel.</span>
<span class="k">let</span> <span class="n">reader</span> <span class="o">=</span> <span class="k">async</span> <span class="k">move</span> <span class="p">{</span>
    <span class="k">for</span> <span class="n">_</span> <span class="k">in</span> <span class="mi">0</span><span class="o">..</span><span class="mi">3</span> <span class="p">{</span>
        <span class="nd">println!</span><span class="p">(</span><span class="s">"{}"</span><span class="p">,</span> <span class="n">recv</span><span class="nf">.recv</span><span class="p">()</span><span class="k">.await</span><span class="nf">.unwrap</span><span class="p">());</span>
    <span class="p">}</span>
<span class="p">};</span>

<span class="c1">// Use the GiantZip to poll all of these at once.</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">zipper</span> <span class="o">=</span> <span class="nn">GiantZip</span><span class="p">::</span><span class="nf">new</span><span class="p">();</span>
<span class="n">zipper</span><span class="nf">.insert</span><span class="p">(</span><span class="nf">our_future</span><span class="p">(</span><span class="mi">1</span><span class="p">));</span>
<span class="n">zipper</span><span class="nf">.insert</span><span class="p">(</span><span class="nf">our_future</span><span class="p">(</span><span class="mi">2</span><span class="p">));</span>
<span class="n">zipper</span><span class="nf">.insert</span><span class="p">(</span><span class="nf">our_future</span><span class="p">(</span><span class="mi">3</span><span class="p">));</span>
<span class="n">zipper</span><span class="nf">.insert</span><span class="p">(</span><span class="n">reader</span><span class="p">);</span>

<span class="c1">// Wait for them to finish.</span>
<span class="nn">future</span><span class="p">::</span><span class="nf">block_on</span><span class="p">(</span><span class="n">zipper</span><span class="p">);</span>
</code></pre></div></div>

<p>When we run it, we see this:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">time </span>cargo run <span class="nt">-q</span>
1
2
3
cargo run <span class="nt">-q</span>  0.03s user 0.03s system 93% cpu 0.064 total
</code></pre></div></div>

<p>That’s pretty fast, but that’s only because we have a low number of futures. If
we have 10,000,000 futures (not an unrealistic number for a web server!), it will
run much slower.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">time </span>cargo run <span class="nt">-q</span>
0
1
2
cargo run <span class="nt">-q</span>  9.16s user 0.70s system 100% cpu 9.803 total
</code></pre></div></div>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> It’s hard to show in text, but there was a delay of a few seconds between each line. So it’s taking a while to get to the future that actually prints the line.</p>
</blockquote>

<p>In addition to being inefficient, we’ve also stumbled upon another issue: <code class="language-plaintext highlighter-rouge">GiantZip</code> is <em>unfair</em>. The <code class="language-plaintext highlighter-rouge">reader</code> <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> is at the very end of the <code class="language-plaintext highlighter-rouge">futures</code> list, which means it’s processed last when polling. Since a lot of the futures
end up blocked on <code class="language-plaintext highlighter-rouge">reader</code>, each pass has to poll all of them before <code class="language-plaintext highlighter-rouge">reader</code> gets a turn, so polling the <code class="language-plaintext highlighter-rouge">GiantZip</code> takes a lot longer than it should.</p>
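
<p>To put a number on the cost, here is a small std-only sketch that counts polls under the “poll everything on every pass” strategy. <code class="language-plaintext highlighter-rouge">Sleepy</code>, <code class="language-plaintext highlighter-rouge">poll_pass</code>, <code class="language-plaintext highlighter-rouge">NoopWaker</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical helpers, not part of the program above, and the futures are artificial: each one simply needs three polls to finish.</p>

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::rc::Rc;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxedFuture = Pin<Box<dyn Future<Output = ()>>>;

/// Hypothetical future that records every poll and returns `Pending`
/// for the first `remaining` polls before finishing.
struct Sleepy {
    remaining: u32,
    polls: Rc<Cell<u64>>,
}

impl Future for Sleepy {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, _cx: &mut Context<'_>) -> Poll<()> {
        self.polls.set(self.polls.get() + 1);
        if self.remaining == 0 {
            Poll::Ready(())
        } else {
            self.remaining -= 1;
            Poll::Pending
        }
    }
}

/// A waker that does nothing; the naive strategy ignores wake-ups anyway.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// One naive pass: poll every unfinished future. Returns `true` if any is still pending.
fn poll_pass(futures: &mut [Option<BoxedFuture>], cx: &mut Context<'_>) -> bool {
    let mut unfinished = false;
    for slot in futures.iter_mut() {
        if let Some(future) = slot.as_mut() {
            let done = future.as_mut().poll(cx).is_ready();
            if done {
                *slot = None;
            } else {
                unfinished = true;
            }
        }
    }
    unfinished
}

/// Returns (number of passes, total number of polls) for `n` futures.
fn demo(n: usize) -> (u32, u64) {
    let polls = Rc::new(Cell::new(0));
    let mut futures: Vec<Option<BoxedFuture>> = (0..n)
        .map(|_| {
            let fut: BoxedFuture = Box::pin(Sleepy { remaining: 2, polls: polls.clone() });
            Some(fut)
        })
        .collect();

    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    let mut passes = 0;
    while poll_pass(&mut futures, &mut cx) {
        passes += 1;
    }
    // Count the final pass, the one that found everything finished.
    (passes + 1, polls.get())
}
```

<p>With 1,000 futures, <code class="language-plaintext highlighter-rouge">demo(1000)</code> makes 3 passes and 3,000 polls. The total grows as the number of futures times the number of passes, no matter how few of those futures could actually make progress.</p>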

<p>Thankfully, we can solve the <code class="language-plaintext highlighter-rouge">O(n)</code> problem and also (kind of) solve the fairness problem in one fell swoop. Instead of polling <em>every</em> future every time we poll,
we should only poll the ones whose <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a>s have been woken up, since those are the ones we know may be able to make progress.</p>

<p>Let’s add a queue structure to the <code class="language-plaintext highlighter-rouge">GiantZip</code> that contains the indexes of the
futures that have been woken and are ready to be polled. I’m wrapping it in an <a href="https://doc.rust-lang.org/stable/std/sync/struct.Arc.html"><code class="language-plaintext highlighter-rouge">Arc</code></a> and a <a href="https://doc.rust-lang.org/stable/std/sync/struct.Mutex.html"><code class="language-plaintext highlighter-rouge">Mutex</code></a>
for reasons that will become obvious later.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">collections</span><span class="p">::</span><span class="n">VecDeque</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">sync</span><span class="p">::{</span><span class="nb">Arc</span><span class="p">,</span> <span class="n">Mutex</span><span class="p">};</span>

<span class="k">struct</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="c1">// Completed futures are represented by `None`.</span>
    <span class="n">futures</span><span class="p">:</span> <span class="n">Slab</span><span class="o">&lt;</span><span class="nb">Option</span><span class="o">&lt;</span><span class="nb">Pin</span><span class="o">&lt;</span><span class="nb">Box</span><span class="o">&lt;</span><span class="k">dyn</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="p">()</span><span class="o">&gt;&gt;&gt;&gt;&gt;</span><span class="p">,</span>

    <span class="c1">// NEW: Queue of indexes of futures that have been woken and need to be polled.</span>
    <span class="n">queue</span><span class="p">:</span> <span class="nb">Arc</span><span class="o">&lt;</span><span class="n">Mutex</span><span class="o">&lt;</span><span class="n">VecDeque</span><span class="o">&lt;</span><span class="nb">usize</span><span class="o">&gt;&gt;&gt;</span><span class="p">,</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">new</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span>
        <span class="k">Self</span> <span class="p">{</span>
            <span class="n">futures</span><span class="p">:</span> <span class="nn">Slab</span><span class="p">::</span><span class="nf">new</span><span class="p">(),</span>
            <span class="n">queue</span><span class="p">:</span> <span class="nn">Arc</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">Mutex</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="nn">VecDeque</span><span class="p">::</span><span class="nf">new</span><span class="p">()))</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, when we first <code class="language-plaintext highlighter-rouge">insert</code> the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> into the <code class="language-plaintext highlighter-rouge">GiantZip</code>, we have to mark
it as ready. This is done by just pushing the index of the future into the queue.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="n">insert</span><span class="o">&lt;</span><span class="n">F</span><span class="p">:</span> <span class="n">Future</span><span class="o">&lt;</span><span class="n">Output</span> <span class="o">=</span> <span class="p">()</span><span class="o">&gt;</span> <span class="o">+</span> <span class="k">'static</span><span class="o">&gt;</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">future</span><span class="p">:</span> <span class="n">F</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">usize</span> <span class="p">{</span>
        <span class="c1">// NEW: Save the index and push it to the back of the queue before returning.</span>
        <span class="k">let</span> <span class="n">index</span> <span class="o">=</span> <span class="k">self</span><span class="py">.futures</span><span class="nf">.insert</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="nn">Box</span><span class="p">::</span><span class="nf">pin</span><span class="p">(</span><span class="n">future</span><span class="p">)));</span>
        <span class="k">self</span><span class="py">.queue</span><span class="nf">.lock</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span><span class="nf">.push_back</span><span class="p">(</span><span class="n">index</span><span class="p">);</span>
        <span class="n">index</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We also need to have a way for the <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> to mark itself as ready. The <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a>
calls the <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> when it is ready to be polled again, so we can just create a <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> that wraps around the top-level <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a>, but also marks the current
future as ready.</p>

<p>We’ll bring in <a href="https://crates.io/crates/waker-fn"><code class="language-plaintext highlighter-rouge">waker-fn</code></a> to make this easier. A <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> is just a glorified callback, so we can easily represent it as one.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>cargo add waker-fn
    Updating crates.io index
      Adding waker-fn v1.1.1 to dependencies
    Updating crates.io index
</code></pre></div></div>
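
<p>To make the “glorified callback” claim concrete, here is a std-only sketch of a callback-backed waker built on the standard library’s <code class="language-plaintext highlighter-rouge">Wake</code> trait. <code class="language-plaintext highlighter-rouge">CallbackWaker</code>, <code class="language-plaintext highlighter-rouge">callback_waker</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical names. I’m not claiming this is how <code class="language-plaintext highlighter-rouge">waker-fn</code> is written internally, only that it produces the same kind of object.</p>

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::task::{Wake, Waker};

/// Hypothetical wrapper that turns a closure into something wakeable.
struct CallbackWaker<F>(F);

impl<F: Fn() + Send + Sync + 'static> Wake for CallbackWaker<F> {
    // Called by `Waker::wake`, which consumes the waker.
    fn wake(self: Arc<Self>) {
        (self.0)();
    }

    // Called by `Waker::wake_by_ref`, which leaves the waker usable.
    fn wake_by_ref(self: &Arc<Self>) {
        (self.0)();
    }
}

/// Turn any thread-safe closure into a `Waker`.
fn callback_waker<F: Fn() + Send + Sync + 'static>(f: F) -> Waker {
    // The closure lives behind an `Arc`, so every call here allocates.
    Waker::from(Arc::new(CallbackWaker(f)))
}

/// Wake twice and report how often the callback ran.
fn demo() -> usize {
    let wakeups = Arc::new(AtomicUsize::new(0));

    let waker = callback_waker({
        let wakeups = wakeups.clone();
        move || {
            wakeups.fetch_add(1, Ordering::SeqCst);
        }
    });

    waker.wake_by_ref();
    waker.clone().wake();

    wakeups.load(Ordering::SeqCst)
}
```

<p>Both ways of waking end up running the same closure, so <code class="language-plaintext highlighter-rouge">demo()</code> returns <code class="language-plaintext highlighter-rouge">2</code>. The closure has to be <code class="language-plaintext highlighter-rouge">Send + Sync</code> because a <code class="language-plaintext highlighter-rouge">Waker</code> can be sent to and called from any thread.</p>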

<p>Let’s make creating the <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> a helper function on <code class="language-plaintext highlighter-rouge">GiantZip</code>, to keep things
clean.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">use</span> <span class="nn">waker_fn</span><span class="p">::</span><span class="n">waker_fn</span><span class="p">;</span>
<span class="k">use</span> <span class="nn">std</span><span class="p">::</span><span class="nn">task</span><span class="p">::</span><span class="n">Waker</span><span class="p">;</span>

<span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="cd">/// Create a waker that wakes the future in the provided slot.</span>
    <span class="k">fn</span> <span class="nf">waker_for_slot</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">,</span> <span class="n">index</span><span class="p">:</span> <span class="nb">usize</span><span class="p">,</span> <span class="n">toplevel</span><span class="p">:</span> <span class="o">&amp;</span><span class="n">Waker</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Waker</span> <span class="p">{</span>
        <span class="c1">// Clone shared resources.</span>
        <span class="c1">// *This* is why we made `queue` wrapped in an `Arc`, by the way.</span>
        <span class="k">let</span> <span class="n">queue</span> <span class="o">=</span> <span class="k">self</span><span class="py">.queue</span><span class="nf">.clone</span><span class="p">();</span>
        <span class="k">let</span> <span class="n">toplevel</span> <span class="o">=</span> <span class="n">toplevel</span><span class="nf">.clone</span><span class="p">();</span>

        <span class="c1">// Create a waker.</span>
        <span class="nf">waker_fn</span><span class="p">(</span><span class="k">move</span> <span class="p">||</span> <span class="p">{</span>
            <span class="c1">// Mark the future as ready.</span>
            <span class="n">queue</span><span class="nf">.lock</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span><span class="nf">.push_back</span><span class="p">(</span><span class="n">index</span><span class="p">);</span>

            <span class="c1">// Wake the toplevel `block_on` waker, so the GiantZip poll()</span>
            <span class="c1">// implementation is run again.</span>
            <span class="n">toplevel</span><span class="nf">.wake_by_ref</span><span class="p">();</span>
        <span class="p">})</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Finally, we can adjust the <code class="language-plaintext highlighter-rouge">poll()</code> implementation for <code class="language-plaintext highlighter-rouge">GiantZip</code> such that it
pops from the queue instead of polling each and every future in the list.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="cd">/// Pop the index of the next woken future from the queue.</span>
    <span class="k">fn</span> <span class="nf">next_index</span><span class="p">(</span><span class="o">&amp;</span><span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nb">Option</span><span class="o">&lt;</span><span class="nb">usize</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="k">self</span><span class="py">.queue</span><span class="nf">.lock</span><span class="p">()</span><span class="nf">.unwrap</span><span class="p">()</span><span class="nf">.pop_front</span><span class="p">()</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">Future</span> <span class="k">for</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">type</span> <span class="n">Output</span> <span class="o">=</span> <span class="p">();</span>

    <span class="k">fn</span> <span class="nf">poll</span><span class="p">(</span><span class="k">self</span><span class="p">:</span> <span class="nb">Pin</span><span class="o">&lt;&amp;</span><span class="k">mut</span> <span class="k">Self</span><span class="o">&gt;</span><span class="p">,</span> <span class="n">cx</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="n">Context</span><span class="o">&lt;</span><span class="nv">'_</span><span class="o">&gt;</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="n">Poll</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="c1">// Get around Rust's pinning rules.</span>
        <span class="k">let</span> <span class="n">this</span> <span class="o">=</span> <span class="k">self</span><span class="nf">.get_mut</span><span class="p">();</span>

        <span class="c1">// NEW: We drain the "queue" instead of iterating over every future.</span>
        <span class="c1">// Make sure not to hold the lock while polling; if a future is woken by another future,</span>
        <span class="c1">// it would deadlock otherwise.</span>
        <span class="k">while</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">index</span><span class="p">)</span> <span class="o">=</span> <span class="n">this</span><span class="nf">.next_index</span><span class="p">()</span> <span class="p">{</span>
            <span class="c1">// NEW: Create a waker to poll this future with.</span>
            <span class="k">let</span> <span class="n">waker</span> <span class="o">=</span> <span class="n">this</span><span class="nf">.waker_for_slot</span><span class="p">(</span><span class="n">index</span><span class="p">,</span> <span class="n">cx</span><span class="nf">.waker</span><span class="p">());</span>
            <span class="k">let</span> <span class="k">mut</span> <span class="n">slot_context</span> <span class="o">=</span> <span class="nn">Context</span><span class="p">::</span><span class="nf">from_waker</span><span class="p">(</span><span class="o">&amp;</span><span class="n">waker</span><span class="p">);</span>

            <span class="k">let</span> <span class="n">future_slot</span> <span class="o">=</span> <span class="k">match</span> <span class="n">this</span><span class="py">.futures</span><span class="nf">.get_mut</span><span class="p">(</span><span class="n">index</span><span class="p">)</span> <span class="p">{</span>
                <span class="nf">Some</span><span class="p">(</span><span class="n">slot</span><span class="p">)</span> <span class="k">=&gt;</span> <span class="n">slot</span><span class="p">,</span>
                <span class="nb">None</span> <span class="k">=&gt;</span> <span class="k">continue</span>
            <span class="p">};</span>
            <span class="k">if</span> <span class="k">let</span> <span class="nf">Some</span><span class="p">(</span><span class="n">future</span><span class="p">)</span> <span class="o">=</span> <span class="n">future_slot</span><span class="nf">.as_mut</span><span class="p">()</span> <span class="p">{</span>
                <span class="c1">// Try to poll this future.</span>
                <span class="k">match</span> <span class="n">future</span><span class="nf">.as_mut</span><span class="p">()</span><span class="nf">.poll</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">slot_context</span><span class="p">)</span> <span class="p">{</span>
                    <span class="nn">Poll</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(())</span> <span class="k">=&gt;</span> <span class="p">{</span>
                        <span class="c1">// Set the future to `None`.</span>
                        <span class="o">*</span><span class="n">future_slot</span> <span class="o">=</span> <span class="nb">None</span><span class="p">;</span>
                    <span class="p">},</span>

                    <span class="nn">Poll</span><span class="p">::</span><span class="n">Pending</span> <span class="k">=&gt;</span> <span class="p">{}</span>
                <span class="p">}</span>
            <span class="p">}</span>
        <span class="p">}</span>

        <span class="k">if</span> <span class="n">this</span><span class="py">.futures</span><span class="nf">.iter</span><span class="p">()</span><span class="nf">.any</span><span class="p">(|(</span><span class="n">_</span><span class="p">,</span> <span class="n">fut</span><span class="p">)|</span> <span class="n">fut</span><span class="nf">.is_some</span><span class="p">())</span> <span class="p">{</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="n">Pending</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="nn">Poll</span><span class="p">::</span><span class="nf">Ready</span><span class="p">(())</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
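
<p>The comment about not holding the lock while polling deserves a closer look. The std-only sketch below imitates a waker firing in the middle of a poll, once with the lock already released and once with the guard still alive. <code class="language-plaintext highlighter-rouge">try_wake</code> and <code class="language-plaintext highlighter-rouge">demo</code> are hypothetical helpers, and <code class="language-plaintext highlighter-rouge">try_wake</code> uses <code class="language-plaintext highlighter-rouge">try_lock</code> so that the demo can report the problem instead of hanging; the waker in this post calls <code class="language-plaintext highlighter-rouge">lock()</code>, which may deadlock or panic in the same situation.</p>

```rust
use std::collections::VecDeque;
use std::sync::Mutex;

/// Pop one index. The guard is a temporary inside this function,
/// so the lock is released by the time the caller sees the result.
fn next_index(queue: &Mutex<VecDeque<usize>>) -> Option<usize> {
    queue.lock().unwrap().pop_front()
}

/// Stand-in for a waker firing while a future is being polled.
/// Returns `false` if the queue could not be locked.
fn try_wake(queue: &Mutex<VecDeque<usize>>, index: usize) -> bool {
    match queue.try_lock() {
        Ok(mut guard) => {
            guard.push_back(index);
            true
        }
        Err(_) => false,
    }
}

/// Returns whether the wake-up succeeded (without a guard held, with a guard held).
fn demo() -> (bool, bool) {
    let queue = Mutex::new(VecDeque::from([0usize, 1]));

    // Safe pattern: the lock is dropped before the "poll" happens.
    let woke_without_guard = match next_index(&queue) {
        Some(_index) => try_wake(&queue, 7),
        None => false,
    };

    // Risky pattern: the guard stays alive across the "poll".
    let mut guard = queue.lock().unwrap();
    let _index = guard.pop_front();
    let woke_with_guard = try_wake(&queue, 8);
    drop(guard);

    (woke_without_guard, woke_with_guard)
}
```

<p>Here <code class="language-plaintext highlighter-rouge">demo()</code> returns <code class="language-plaintext highlighter-rouge">(true, false)</code>: the wake-up only gets through when the guard has already been dropped. Keeping the <code class="language-plaintext highlighter-rouge">lock()</code> call inside <code class="language-plaintext highlighter-rouge">next_index</code> is what guarantees that in the <code class="language-plaintext highlighter-rouge">poll()</code> loop.</p>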

<p>When we run the program, we see the following output:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span><span class="nb">time </span>cargo run <span class="nt">-q</span>
0
1
2
cargo run <span class="nt">-q</span>  5.03s user 0.62s system 101% cpu 5.583 total
</code></pre></div></div>

<p>There is still quite a bit of contention caused by the initial burst of futures
as well as the last burst of futures. But there is no longer a delay between the
printed numbers, which indicates that the runtime is being used more
efficiently.</p>

<p>This also solves our cancellation problem; we can just add a <code class="language-plaintext highlighter-rouge">remove</code> method to
the <code class="language-plaintext highlighter-rouge">GiantZip</code> to remove a keyed future from the list.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">impl</span> <span class="n">GiantZip</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">remove</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">index</span><span class="p">:</span> <span class="nb">usize</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">self</span><span class="py">.futures</span><span class="nf">.remove</span><span class="p">(</span><span class="n">index</span><span class="p">);</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">let</span> <span class="n">key</span> <span class="o">=</span> <span class="n">zipper</span><span class="nf">.insert</span><span class="p">(</span><span class="k">async</span> <span class="p">{</span> <span class="nd">panic!</span><span class="p">()</span> <span class="p">});</span>
<span class="c1">// Actually, might not be the best idea to run that task.</span>
<span class="n">zipper</span><span class="nf">.remove</span><span class="p">(</span><span class="n">key</span><span class="p">);</span>
</code></pre></div></div>

<p>Now we’ve solved all of our problems… and introduced a million new ones.</p>

<h2 id="persistent-problems">Persistent Problems</h2>

<p>I’ve deliberately made some mistakes in the above example, in order to illustrate how fixing those mistakes can lead to a very important data pattern. So let’s discuss those mistakes.</p>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> “Deliberately”, you say?</p>
</blockquote>

<blockquote>
  <p><img src="/images/notgull.png" alt="Notgull" width="100" />
<strong>notgull:</strong> Hey, I’ll have you know, I’m just following the natural evolution of the <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> pattern.</p>
</blockquote>

<blockquote>
  <p><img src="/images/ddog.jpg" alt="Dependency Dog" width="100" />
<strong>Dependency Dog:</strong> Did the “natural evolution” force you to re-allocate a new <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> every time you polled a future?</p>
</blockquote>

<p>The first issue is that all of this is <em>very</em> inefficient. Ignoring our suboptimal queueing structure, we have three main allocations here:</p>

<ul>
  <li>We have a <code class="language-plaintext highlighter-rouge">Vec</code> to hold our futures inside of.</li>
  <li>Each individual <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> requires its own <code class="language-plaintext highlighter-rouge">Box</code>.</li>
  <li>Every time the <code class="language-plaintext highlighter-rouge">GiantZip</code> is polled we have to create a <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> to poll it with. The <code class="language-plaintext highlighter-rouge">waker_fn</code> crate allocates this inside of an <a href="https://doc.rust-lang.org/stable/std/sync/struct.Arc.html"><code class="language-plaintext highlighter-rouge">Arc</code></a>.</li>
</ul>

<p>Specifically we should be concerned about the <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> allocation, since it occurs on the hot path. We should try our best to make sure that we can create a <a href="https://doc.rust-lang.org/stable/std/task/struct.Waker.html"><code class="language-plaintext highlighter-rouge">Waker</code></a> without allocating.</p>
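
<p>As a rough sketch of one way to keep that allocation off the hot path, using only the standard library: build the <code class="language-plaintext highlighter-rouge">Waker</code> once and reuse it for every poll. The names <code class="language-plaintext highlighter-rouge">FlagWaker</code> and <code class="language-plaintext highlighter-rouge">Poller</code> are hypothetical and exist only for this illustration; they are not part of <code class="language-plaintext highlighter-rouge">GiantZip</code> or <code class="language-plaintext highlighter-rouge">waker_fn</code>.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical wake target: it only records that a wakeup happened.
struct FlagWaker {
    woken: AtomicBool,
}

impl Wake for FlagWaker {
    fn wake(self: Arc<Self>) {
        self.woken.store(true, Ordering::SeqCst);
    }
}

/// Hypothetical holder that builds its `Waker` once, up front.
struct Poller {
    flag: Arc<FlagWaker>,
    waker: Waker,
}

impl Poller {
    fn new() -> Self {
        // This `Arc::new` is the only allocation in the sketch.
        let flag = Arc::new(FlagWaker {
            woken: AtomicBool::new(false),
        });
        // Turning a clone of the `Arc` into a `Waker` bumps a reference
        // count; it does not allocate again.
        let waker = Waker::from(flag.clone());
        Self { flag, waker }
    }

    /// Poll a future with the cached `Waker`. Nothing is allocated here.
    fn poll_with_cached<F: Future + Unpin>(&self, fut: &mut F) -> Poll<F::Output> {
        let mut cx = Context::from_waker(&self.waker);
        Pin::new(fut).poll(&mut cx)
    }
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">poll_with_cached</code> repeatedly reuses the same <code class="language-plaintext highlighter-rouge">Waker</code>, so the per-poll cost is building a <code class="language-plaintext highlighter-rouge">Context</code> on the stack.</p>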

<p>There are some other persistent problems, like:</p>

<ul>
  <li>It’s very easy to misuse the API. <code class="language-plaintext highlighter-rouge">usize</code> isn’t a great type to index by, for a collection of hotly-polled futures.</li>
  <li>You can’t get the result of a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> after it completes… or use a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> that returns anything other than <code class="language-plaintext highlighter-rouge">()</code>, for that matter.</li>
  <li>It is very difficult (albeit not impossible) to remove a <a href="https://doc.rust-lang.org/std/future/trait.Future.html"><code class="language-plaintext highlighter-rouge">Future</code></a> from the list while it is running.</li>
</ul>

<p>We’ll begin to address these problems in the next blog post, when we start to build a real task abstraction.</p>]]></content><author><name>John Nunley</name></author><category term="async" /><category term="rust" /><category term="smol" /><category term="taskexplained" /><summary type="html"><![CDATA[async-task is one of the most complicated crates in the smol ecosystem. But, fundamentally, it’s just a future on the heap.]]></summary></entry><entry><title type="html">Why choose async/await over threads?</title><link href="http://notgull.net/why-not-threads/" rel="alternate" type="text/html" title="Why choose async/await over threads?" /><published>2024-03-24T00:00:00+00:00</published><updated>2024-03-24T00:00:00+00:00</updated><id>http://notgull.net/why-not-threads</id><content type="html" xml:base="http://notgull.net/why-not-threads/"><![CDATA[<p>A common refrain is that threads can do everything that <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> can, but
simpler. So why would anyone choose <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>?</p>

<p>This is a common question that I’ve seen a lot in the Rust community. Frankly, I completely understand
where it’s coming from.</p>

<p>Rust is a low-level language that doesn’t hide the
complexity of coroutines from you. This is in opposition to languages like Go,
where <code class="language-plaintext highlighter-rouge">async</code> happens by default, without the programmer needing to even
consider it.</p>

<p>Smart programmers try to avoid complexity. So, they see the extra complexity in
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> and question why it is needed. This question is especially
pertinent when considering that a reasonable alternative exists in OS threads.</p>

<p>Let’s take a mind-journey through <code class="language-plaintext highlighter-rouge">async</code> and see how it stacks up.</p>

<h2 id="background-blitz">Background Blitz</h2>

<p>Rust is a low-level language. Normally, code is linear; one thing runs after
another. It looks like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="p">{</span>
    <span class="nf">foo</span><span class="p">();</span>
    <span class="nf">bar</span><span class="p">();</span>
    <span class="nf">baz</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Nice and simple, right?</p>

<p>However, sometimes you will want to run many things at once. The canonical
example for this is a web server. Consider the following written in linear
code:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">socket</span> <span class="o">=</span> <span class="nn">TcpListener</span><span class="p">::</span><span class="nf">bind</span><span class="p">(</span><span class="s">"0.0.0.0:80"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="k">loop</span> <span class="p">{</span>
        <span class="k">let</span> <span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">socket</span><span class="nf">.accept</span><span class="p">()</span><span class="o">?</span><span class="p">;</span>
        <span class="nf">handle_client</span><span class="p">(</span><span class="n">client</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Imagine if <code class="language-plaintext highlighter-rouge">handle_client</code> takes a few milliseconds, and two clients try to
connect to your webserver at the same time. You’ll run into a serious
problem!</p>

<ul>
  <li>Client #1 connects to the webserver, and is accepted by the <code class="language-plaintext highlighter-rouge">accept()</code>
function. It starts running <code class="language-plaintext highlighter-rouge">handle_client()</code>.</li>
  <li>Client #2 connects to the webserver. However, since <code class="language-plaintext highlighter-rouge">accept()</code> is not
currently running, we have to wait for <code class="language-plaintext highlighter-rouge">handle_client()</code> for Client #1 to
finish running.</li>
  <li>After waiting a few milliseconds, we get back to <code class="language-plaintext highlighter-rouge">accept()</code>. Client #2 can
connect.</li>
</ul>

<p>Now imagine that instead of two clients, there are two million simultaneous
clients. At the end of the queue, you’ll have to wait several minutes
before the web server can help you. It becomes un-scalable very quickly.</p>

<p>Obviously, the embryonic web tried to solve this problem. The original solution was to introduce threading. By saving
the value of some registers and the program’s stack into memory, the operating
system can stop a program, run another program in its place, then resume running
that program later. Essentially, it allows for multiple routines (or “threads”,
or “processes”) to run on the same CPU.</p>

<p>Using threads, we can rewrite the above code as follows:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">main</span><span class="p">()</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">socket</span> <span class="o">=</span> <span class="nn">TcpListener</span><span class="p">::</span><span class="nf">bind</span><span class="p">(</span><span class="s">"0.0.0.0:80"</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="k">loop</span> <span class="p">{</span>
        <span class="k">let</span> <span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">socket</span><span class="nf">.accept</span><span class="p">()</span><span class="o">?</span><span class="p">;</span>
        <span class="nn">thread</span><span class="p">::</span><span class="nf">spawn</span><span class="p">(</span><span class="k">move</span> <span class="p">||</span> <span class="nf">handle_client</span><span class="p">(</span><span class="n">client</span><span class="p">));</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Now, each client is handled by a different thread from the one waiting for
new connections. Great! This avoids the problem by letting those threads run
concurrently.</p>

<ul>
  <li>Client #1 is <code class="language-plaintext highlighter-rouge">accept</code>ed by the server. The server spawns a thread that calls
<code class="language-plaintext highlighter-rouge">handle_client</code>.</li>
  <li>Client #2 tries to connect to the server.</li>
  <li>Eventually, <code class="language-plaintext highlighter-rouge">handle_client</code> blocks on something. The OS saves the thread
handling Client #1 and brings back the main thread.</li>
  <li>The main thread <code class="language-plaintext highlighter-rouge">accept</code>s Client #2. It spawns a separate thread to handle
Client #2. With only a few microseconds of delay, Client #1 and Client #2
are run in parallel.</li>
</ul>

<p>Threads work especially well when you consider that production-grade web servers
have dozens of CPU cores. It’s not just that the OS can give the <em>illusion</em> that
all of these threads run at the same time; it’s that the OS can <em>actually</em> make
them all run at once.</p>

<p>Eventually, for reasons I’ll elaborate on later, programmers wanted to bring this
concurrency out of the OS space and into the user space. There are many
different models for userspace concurrency: event-driven programming,
actors, and coroutines. The one Rust settled on is <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>.</p>

<p>To oversimplify, you compile the program as a grab-bag of state machines that
can all be run independently of one another. Rust itself provides a mechanism for
creating state machines: <code class="language-plaintext highlighter-rouge">async</code> and <code class="language-plaintext highlighter-rouge">await</code>. The above program in terms of <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> would
look like this, written using <a href="https://crates.io/crates/smol"><code class="language-plaintext highlighter-rouge">smol</code></a>:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nd">#[apply(smol_macros::main</span><span class="err">!</span><span class="nd">)]</span>
<span class="k">async</span> <span class="k">fn</span> <span class="nf">main</span><span class="p">(</span><span class="n">ex</span><span class="p">:</span> <span class="o">&amp;</span><span class="nn">smol</span><span class="p">::</span><span class="n">Executor</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="n">socket</span> <span class="o">=</span> <span class="nn">TcpListener</span><span class="p">::</span><span class="nf">bind</span><span class="p">(</span><span class="s">"0.0.0.0:80"</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>

    <span class="k">loop</span> <span class="p">{</span>
        <span class="k">let</span> <span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="n">_</span><span class="p">)</span> <span class="o">=</span> <span class="n">socket</span><span class="nf">.accept</span><span class="p">()</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
        <span class="n">ex</span><span class="nf">.spawn</span><span class="p">(</span><span class="k">async</span> <span class="k">move</span> <span class="p">{</span>
            <span class="nf">handle_client</span><span class="p">(</span><span class="n">client</span><span class="p">)</span><span class="k">.await</span><span class="p">;</span>
        <span class="p">})</span><span class="nf">.detach</span><span class="p">();</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
  <li>The main function is preceded with the <code class="language-plaintext highlighter-rouge">async</code> keyword. This means that it is
not a traditional function, but one that returns a state machine. Roughly, the
function’s contents correspond to that state machine.</li>
  <li><code class="language-plaintext highlighter-rouge">await</code> includes another state machine as a part of the currently running
state machine. For <code class="language-plaintext highlighter-rouge">accept()</code>, it means that the state machine will include it
as a step.</li>
  <li>Eventually, one of the inner functions will <em>yield</em>, or give up control. For
example, when <code class="language-plaintext highlighter-rouge">accept()</code> waits
for a new connection. At this point the entire state machine will yield its execution to
the higher-level executor. For us, that is <code class="language-plaintext highlighter-rouge">smol::Executor</code>.</li>
  <li>Once execution is yielded, the <code class="language-plaintext highlighter-rouge">Executor</code> will replace the current state
machine with another one that is running concurrently, spawned through the
<code class="language-plaintext highlighter-rouge">spawn</code> function.</li>
  <li>We pass an <code class="language-plaintext highlighter-rouge">async</code> block to the <code class="language-plaintext highlighter-rouge">spawn</code> function. This block represents an entire new state
machine, independent of the one created by the <code class="language-plaintext highlighter-rouge">main</code> function. All this state
machine does is run the <code class="language-plaintext highlighter-rouge">handle_client</code> function.</li>
  <li>Once <code class="language-plaintext highlighter-rouge">main</code> yields, one of the clients is selected to run in its place. Once that
client yields, the cycle repeats.</li>
  <li>You can now handle millions of simultaneous clients.</li>
</ul>
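
<p>To make “a function that returns a state machine” a little more concrete, here is a hand-written sketch of a future with a single yield point, using only the standard library. <code class="language-plaintext highlighter-rouge">TwoStep</code>, <code class="language-plaintext highlighter-rouge">NoopWake</code> and <code class="language-plaintext highlighter-rouge">drive</code> are hypothetical names for this illustration. The compiler’s real output is more involved, and a real executor such as <code class="language-plaintext highlighter-rouge">smol::Executor</code> waits for a wakeup instead of polling in a loop.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical hand-written state machine with one yield point.
enum TwoStep {
    Start,
    Yielded,
    Done,
}

impl Future for TwoStep {
    type Output = &'static str;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // `TwoStep` holds no self-references, so it is `Unpin`.
        let this = self.get_mut();
        match *this {
            TwoStep::Start => {
                // Yield: give up control, but ask to be polled again.
                *this = TwoStep::Yielded;
                cx.waker().wake_by_ref();
                Poll::Pending
            }
            TwoStep::Yielded => {
                *this = TwoStep::Done;
                Poll::Ready("finished")
            }
            TwoStep::Done => panic!("polled after completion"),
        }
    }
}

/// A waker that does nothing; enough for the busy-polling loop below.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy driver: poll until the future finishes, and report how many polls
/// that took.
fn drive<F: Future + Unpin>(mut fut: F) -> (F::Output, usize) {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut polls = 0;
    loop {
        polls += 1;
        if let Poll::Ready(out) = Pin::new(&mut fut).poll(&mut cx) {
            return (out, polls);
        }
    }
}
```

<p>Driving <code class="language-plaintext highlighter-rouge">TwoStep::Start</code> takes two polls: the first one yields, the second one completes.</p>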

<p>Of course, user-space concurrency like this introduces an uptick in
complexity. When you’re using threads, you don’t have to deal with executors
and tasks and state machines and all.</p>

<p>If you’re a reasonable person, you might be asking “why do we need to do all of
this? Threads work well; for 99% of programs, we don’t need to involve any kind
of user-space concurrency. Introducing new complexity is technical debt, and
technical debt costs us time and money.</p>

<p>“So why wouldn’t we use threads?”</p>

<h2 id="timeout-trouble">Timeout Trouble</h2>

<p>Perhaps one of Rust’s biggest strengths is <em>composability</em>. It provides a set of
abstractions that can be nested, built upon, put together, and expanded upon.</p>

<p>I recall that <em>the</em> thing that made me stick with Rust is the <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code class="language-plaintext highlighter-rouge">Iterator</code></a>
trait. It blew my mind that you could make something an <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code class="language-plaintext highlighter-rouge">Iterator</code></a>, apply
a handful of different combinators, then pass the resulting <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code class="language-plaintext highlighter-rouge">Iterator</code></a> into
any function that took an <a href="https://doc.rust-lang.org/std/iter/trait.Iterator.html"><code class="language-plaintext highlighter-rouge">Iterator</code></a>.</p>

<p>It continues to impress me how powerful it is. Let’s say you want to receive a list of
integers from another thread, only take the ones that are immediately available,
discard any integers that aren’t even, add one to all of them, then push them
onto a new list.</p>

<p>That would be fifty lines and a helper function in some other languages. In Rust
it can be done in five:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="p">(</span><span class="n">send</span><span class="p">,</span> <span class="n">recv</span><span class="p">)</span> <span class="o">=</span> <span class="nn">mpsc</span><span class="p">::</span><span class="nf">channel</span><span class="p">();</span>
<span class="n">my_list</span><span class="nf">.extend</span><span class="p">(</span>
    <span class="n">recv</span><span class="nf">.try_iter</span><span class="p">()</span>
        <span class="nf">.filter</span><span class="p">(|</span><span class="n">x</span><span class="p">|</span> <span class="n">x</span> <span class="o">&amp;</span> <span class="mi">1</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
        <span class="nf">.map</span><span class="p">(|</span><span class="n">x</span><span class="p">|</span> <span class="n">x</span> <span class="o">+</span> <span class="mi">1</span><span class="p">)</span>
<span class="p">);</span>
</code></pre></div></div>
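
<p>For anyone who wants to run that chain, here is a self-contained sketch of it. The <code class="language-plaintext highlighter-rouge">collect_ready</code> and <code class="language-plaintext highlighter-rouge">produce</code> helpers are hypothetical, and the producer thread is joined before draining only so that the result is deterministic.</p>

```rust
use std::sync::mpsc;
use std::thread;

/// Drain whatever is already queued, keep the even numbers, and add one.
fn collect_ready(recv: &mpsc::Receiver<i32>) -> Vec<i32> {
    let mut my_list = Vec::new();
    my_list.extend(
        recv.try_iter()
            .filter(|x| x & 1 == 0)
            .map(|x| x + 1),
    );
    my_list
}

/// Hypothetical producer: sends 1 through 6 from another thread and waits
/// for that thread to finish before handing back the receiver.
fn produce() -> mpsc::Receiver<i32> {
    let (send, recv) = mpsc::channel();
    thread::spawn(move || {
        for i in 1..=6 {
            send.send(i).unwrap();
        }
    })
    .join()
    .unwrap();
    recv
}
```

<p>With the six values above, the even ones are 2, 4 and 6, so <code class="language-plaintext highlighter-rouge">collect_ready(&amp;produce())</code> yields 3, 5 and 7.</p>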

<p>The best thing about <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> is that it lets you apply this composability
to I/O-bound functions. Let’s say you have a new client requirement: you want to add a timeout to the
function above. Assume that our <code class="language-plaintext highlighter-rouge">handle_client</code> function above looks like this:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">fn</span> <span class="nf">handle_client</span><span class="p">(</span><span class="k">mut</span> <span class="n">client</span><span class="p">:</span> <span class="n">TcpStream</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
    <span class="n">client</span><span class="nf">.read_to_end</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
    
    <span class="k">let</span> <span class="n">response</span> <span class="o">=</span> <span class="nf">do_something_with_data</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
    <span class="n">client</span><span class="nf">.write_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">response</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>

    <span class="nf">Ok</span><span class="p">(())</span>
<span class="p">}</span>
</code></pre></div></div>

<p>If we want to add, say, a three-second timeout, we can combine two combinators
to do that:</p>

<ul>
  <li>The <a href="https://docs.rs/smol/latest/smol/future/fn.race.html"><code class="language-plaintext highlighter-rouge">race</code></a> function takes two futures and runs them at the same time.</li>
  <li>The <a href="https://docs.rs/smol/latest/smol/struct.Timer.html"><code class="language-plaintext highlighter-rouge">Timer</code></a> future waits for some time before returning.</li>
</ul>

<p>Here is what the final code looks like:</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">async</span> <span class="k">fn</span> <span class="nf">handle_client</span><span class="p">(</span><span class="k">mut</span> <span class="n">client</span><span class="p">:</span> <span class="n">TcpStream</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="c1">// Future that handles the actual connection.</span>
    <span class="k">let</span> <span class="n">driver</span> <span class="o">=</span> <span class="k">async</span> <span class="k">move</span> <span class="p">{</span>
        <span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
        <span class="n">client</span><span class="nf">.read_to_end</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
        
        <span class="k">let</span> <span class="n">response</span> <span class="o">=</span> <span class="nf">do_something_with_data</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>
        <span class="n">client</span><span class="nf">.write_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">response</span><span class="p">)</span><span class="k">.await</span><span class="o">?</span><span class="p">;</span>

        <span class="nf">Ok</span><span class="p">(())</span>
    <span class="p">};</span>

    <span class="c1">// Future that handles waiting for a timeout.</span>
    <span class="k">let</span> <span class="n">timeout</span> <span class="o">=</span> <span class="k">async</span> <span class="p">{</span>
        <span class="nn">Timer</span><span class="p">::</span><span class="nf">after</span><span class="p">(</span><span class="nn">Duration</span><span class="p">::</span><span class="nf">from_secs</span><span class="p">(</span><span class="mi">3</span><span class="p">))</span><span class="k">.await</span><span class="p">;</span>

        <span class="c1">// We just hit a timeout! Return an error.</span>
        <span class="nf">Err</span><span class="p">(</span><span class="nn">io</span><span class="p">::</span><span class="nn">ErrorKind</span><span class="p">::</span><span class="n">TimedOut</span><span class="nf">.into</span><span class="p">())</span>
    <span class="p">};</span>

    <span class="c1">// Run both concurrently; whichever finishes first wins.</span>
    <span class="n">driver</span><span class="nf">.race</span><span class="p">(</span><span class="n">timeout</span><span class="p">)</span><span class="k">.await</span>
<span class="p">}</span>
</code></pre></div></div>

<p>I find this to be a very easy process. All you have to do is wrap your existing
code in an <code class="language-plaintext highlighter-rouge">async</code> block and race it against another future.</p>
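
<p>To show why racing gives you a timeout, here is a minimal standard-library sketch of a race-style combinator. It is an assumption-laden stand-in, not the implementation behind smol’s <code class="language-plaintext highlighter-rouge">race</code>: the <code class="language-plaintext highlighter-rouge">Race</code> struct, <code class="language-plaintext highlighter-rouge">NoopWake</code> and <code class="language-plaintext highlighter-rouge">poll_once</code> are hypothetical names, and the sketch only supports <code class="language-plaintext highlighter-rouge">Unpin</code> futures to stay short.</p>

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical race combinator: polls `a`, then `b`, and completes with
/// whichever one is ready first. The loser is simply dropped.
struct Race<A, B> {
    a: A,
    b: B,
}

impl<T, A, B> Future for Race<A, B>
where
    A: Future<Output = T> + Unpin,
    B: Future<Output = T> + Unpin,
{
    type Output = T;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<T> {
        let this = self.get_mut();
        // Give the first future a chance; if it is done, we are done.
        if let Poll::Ready(out) = Pin::new(&mut this.a).poll(cx) {
            return Poll::Ready(out);
        }
        // Otherwise the result is whatever the second future says.
        Pin::new(&mut this.b).poll(cx)
    }
}

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Poll a future a single time.
fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

<p>If the first future never completes and the second one stands in for an expired timer, the race resolves to the timer’s value. If the first future is ready, its value wins and the timer is never consulted.</p>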

<p>An added bonus of this approach is that it works with any kind of stream. Here,
we use a <code class="language-plaintext highlighter-rouge">TcpStream</code>. However, we can easily replace it with anything that
implements <code class="language-plaintext highlighter-rouge">AsyncRead + AsyncWrite</code>. It could be a GZIP stream on top of
the normal stream, or a Unix socket, or a file. <code class="language-plaintext highlighter-rouge">async</code> just slides into
whatever pattern you need from it.</p>

<h2 id="thematic-threads">Thematic Threads</h2>

<p>What if we wanted to implement this in our threaded example above?</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">fn</span> <span class="nf">handle_client</span><span class="p">(</span><span class="k">mut</span> <span class="n">client</span><span class="p">:</span> <span class="n">TcpStream</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
    <span class="n">client</span><span class="nf">.read_to_end</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    
    <span class="k">let</span> <span class="n">response</span> <span class="o">=</span> <span class="nf">do_something_with_data</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
    <span class="n">client</span><span class="nf">.write_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">response</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

    <span class="nf">Ok</span><span class="p">(())</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Well, it’s not easy. Generally, you can’t interrupt the <code class="language-plaintext highlighter-rouge">read</code> or <code class="language-plaintext highlighter-rouge">write</code>
system calls in blocking code, without doing something catastrophic like
closing the file descriptor (which can’t be done in Rust).</p>

<p>Thankfully, <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html"><code class="language-plaintext highlighter-rouge">TcpStream</code></a> has two functions, <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html#method.set_read_timeout"><code class="language-plaintext highlighter-rouge">set_read_timeout</code></a> and
<a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html#method.set_write_timeout"><code class="language-plaintext highlighter-rouge">set_write_timeout</code></a>, that can be used to set the timeouts for reading and
writing, respectively. However, we can’t just use them naively: each timeout applies to a
single read or write call, not to the exchange as a whole. Imagine a client
that sends one byte every 2.9 seconds, just to reset the timeout.</p>

<p>So we have to program a little defensively here. Due to the power of Rust combinators, we can write our own
type wrapping around the <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html"><code class="language-plaintext highlighter-rouge">TcpStream</code></a> to program the timeout.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Deadline-aware wrapper around `TcpStream`.</span>
<span class="k">struct</span> <span class="n">DeadlineStream</span> <span class="p">{</span>
    <span class="n">tcp</span><span class="p">:</span> <span class="n">TcpStream</span><span class="p">,</span>
    <span class="n">deadline</span><span class="p">:</span> <span class="n">Instant</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="n">DeadlineStream</span> <span class="p">{</span>
    <span class="cd">/// Create a new `DeadlineStream` that expires after some time.</span>
    <span class="k">fn</span> <span class="nf">new</span><span class="p">(</span><span class="n">tcp</span><span class="p">:</span> <span class="n">TcpStream</span><span class="p">,</span> <span class="n">timeout</span><span class="p">:</span> <span class="n">Duration</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="k">Self</span> <span class="p">{</span>
        <span class="k">Self</span> <span class="p">{</span>
            <span class="n">tcp</span><span class="p">,</span>
            <span class="n">deadline</span><span class="p">:</span> <span class="nn">Instant</span><span class="p">::</span><span class="nf">now</span><span class="p">()</span> <span class="o">+</span> <span class="n">timeout</span><span class="p">,</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="nn">io</span><span class="p">::</span><span class="n">Read</span> <span class="k">for</span> <span class="n">DeadlineStream</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">read</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">buf</span><span class="p">:</span> <span class="o">&amp;</span><span class="k">mut</span> <span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="nb">usize</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="c1">// Set the deadline.</span>
        <span class="k">let</span> <span class="n">time_left</span> <span class="o">=</span> <span class="k">self</span><span class="py">.deadline</span><span class="nf">.saturating_duration_since</span><span class="p">(</span><span class="nn">Instant</span><span class="p">::</span><span class="nf">now</span><span class="p">());</span>
        <span class="k">self</span><span class="py">.tcp</span><span class="nf">.set_read_timeout</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="n">time_left</span><span class="p">))</span><span class="o">?</span><span class="p">;</span>

        <span class="c1">// Read from the stream.</span>
        <span class="k">self</span><span class="py">.tcp</span><span class="nf">.read</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="k">impl</span> <span class="nn">io</span><span class="p">::</span><span class="n">Write</span> <span class="k">for</span> <span class="n">DeadlineStream</span> <span class="p">{</span>
    <span class="k">fn</span> <span class="nf">write</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">,</span> <span class="n">buf</span><span class="p">:</span> <span class="o">&amp;</span><span class="p">[</span><span class="nb">u8</span><span class="p">])</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="nb">usize</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="c1">// Set the deadline.</span>
        <span class="k">let</span> <span class="n">time_left</span> <span class="o">=</span> <span class="k">self</span><span class="py">.deadline</span><span class="nf">.saturating_duration_since</span><span class="p">(</span><span class="nn">Instant</span><span class="p">::</span><span class="nf">now</span><span class="p">());</span>
        <span class="k">self</span><span class="py">.tcp</span><span class="nf">.set_write_timeout</span><span class="p">(</span><span class="nf">Some</span><span class="p">(</span><span class="n">time_left</span><span class="p">))</span><span class="o">?</span><span class="p">;</span>

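        <span class="c1">// Write to the stream.</span>
        <span class="k">self</span><span class="py">.tcp</span><span class="nf">.write</span><span class="p">(</span><span class="n">buf</span><span class="p">)</span>
    <span class="p">}</span>

    <span class="k">fn</span> <span class="nf">flush</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="k">self</span><span class="p">)</span> <span class="k">-&gt;</span> <span class="nn">io</span><span class="p">::</span><span class="nb">Result</span><span class="o">&lt;</span><span class="p">()</span><span class="o">&gt;</span> <span class="p">{</span>
        <span class="c1">// `io::Write` requires `flush`; forward it to the socket.</span>
        <span class="k">self</span><span class="py">.tcp</span><span class="nf">.flush</span><span class="p">()</span>
    <span class="p">}</span>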
<span class="p">}</span>

<span class="c1">// Create the wrapper.</span>
<span class="k">let</span> <span class="k">mut</span> <span class="n">client</span> <span class="o">=</span> <span class="nn">DeadlineStream</span><span class="p">::</span><span class="nf">new</span><span class="p">(</span><span class="n">client</span><span class="p">,</span> <span class="nn">Duration</span><span class="p">::</span><span class="nf">from_secs</span><span class="p">(</span><span class="mi">3</span><span class="p">));</span>

<span class="k">let</span> <span class="k">mut</span> <span class="n">data</span> <span class="o">=</span> <span class="nd">vec!</span><span class="p">[];</span>
<span class="n">client</span><span class="nf">.read_to_end</span><span class="p">(</span><span class="o">&amp;</span><span class="k">mut</span> <span class="n">data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

<span class="k">let</span> <span class="n">response</span> <span class="o">=</span> <span class="nf">do_something_with_data</span><span class="p">(</span><span class="n">data</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>
<span class="n">client</span><span class="nf">.write_all</span><span class="p">(</span><span class="o">&amp;</span><span class="n">response</span><span class="p">)</span><span class="o">?</span><span class="p">;</span>

<span class="nf">Ok</span><span class="p">(())</span>
</code></pre></div></div>

<p>On one hand, it could be argued that this is elegant. We used Rust’s
capabilities to solve the problem with a relatively simple combinator. I’m sure
it would work well enough.</p>

<p>On the other hand, it’s definitely hacky.</p>

<ul>
  <li>We’ve locked ourselves into using <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html"><code class="language-plaintext highlighter-rouge">TcpStream</code></a>. There’s no trait in Rust to
abstract over the <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html#method.set_read_timeout"><code class="language-plaintext highlighter-rouge">set_read_timeout</code></a> and <a href="https://doc.rust-lang.org/std/net/struct.TcpStream.html#method.set_write_timeout"><code class="language-plaintext highlighter-rouge">set_write_timeout</code></a> methods.
So it would take a lot of additional work to make it work with any other kind of reader or writer.</li>
  <li>It involves an extra system call for setting the timeout.</li>
  <li>I imagine this type is much more unwieldy to use for the kinds of actual logic
that web servers demand.</li>
</ul>
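
<p>To give a feel for that additional work, here’s a rough sketch of what abstracting over just the read timeout might look like. The <code class="language-plaintext highlighter-rouge">SetReadTimeout</code> trait and the <code class="language-plaintext highlighter-rouge">DeadlineReader</code> type are hypothetical names made up for illustration. Nothing like them exists in the standard library, and every stream type would need its own <code class="language-plaintext highlighter-rouge">impl</code>.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::io::{self, Read};
use std::net::TcpStream;
use std::time::{Duration, Instant};

/// Hypothetical trait (not in std): a stream whose blocking reads can time out.
trait SetReadTimeout {
    fn set_read_timeout(&amp;self, timeout: Option&lt;Duration&gt;) -&gt; io::Result&lt;()&gt;;
}

impl SetReadTimeout for TcpStream {
    fn set_read_timeout(&amp;self, timeout: Option&lt;Duration&gt;) -&gt; io::Result&lt;()&gt; {
        // Path syntax picks the inherent method, so this does not recurse.
        TcpStream::set_read_timeout(self, timeout)
    }
}

// Unix sockets need the same boilerplate all over again.
#[cfg(unix)]
impl SetReadTimeout for std::os::unix::net::UnixStream {
    fn set_read_timeout(&amp;self, timeout: Option&lt;Duration&gt;) -&gt; io::Result&lt;()&gt; {
        std::os::unix::net::UnixStream::set_read_timeout(self, timeout)
    }
}

/// Hypothetical generic version of the read half of `DeadlineStream`.
struct DeadlineReader&lt;S&gt; {
    inner: S,
    deadline: Instant,
}

impl&lt;S&gt; DeadlineReader&lt;S&gt; {
    fn new(inner: S, timeout: Duration) -&gt; Self {
        Self { inner, deadline: Instant::now() + timeout }
    }
}

impl&lt;S: Read + SetReadTimeout&gt; Read for DeadlineReader&lt;S&gt; {
    fn read(&amp;mut self, buf: &amp;mut [u8]) -&gt; io::Result&lt;usize&gt; {
        let time_left = self.deadline.saturating_duration_since(Instant::now());
        if time_left.is_zero() {
            // The std setters reject a zero timeout, so report expiry ourselves.
            return Err(io::ErrorKind::TimedOut.into());
        }
        self.inner.set_read_timeout(Some(time_left))?;
        self.inner.read(buf)
    }
}
</code></pre></div></div>

<p>That covers reads on two socket types. The write half, and any stream that has no timeout setter at all, would still be left out.</p>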

<p>If I saw this code in production, I would ask the author why they avoided using
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> to solve this problem. This is the phenomenon I was describing
in my post “<a href="/why-you-want-async/">Why you might actually want async in your project</a>”.
Quite frequently I encounter a pattern where synchronous code can’t be used
without contortion, so I have to rewrite it in <code class="language-plaintext highlighter-rouge">async</code>.</p>

<h2 id="async-success-stories">Async Success Stories</h2>

<p>There’s a reason why the HTTP ecosystem has adopted <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> as its
primary runtime mechanism, even for clients. You can take any function that
makes an HTTP call, and make it fit whatever hole or use case you want it to.</p>
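
<p>To make that concrete, here’s a toy, standard-library-only sketch of a deadline that wraps <em>any</em> future instead of one specific stream type. The names <code class="language-plaintext highlighter-rouge">WithDeadline</code>, <code class="language-plaintext highlighter-rouge">with_deadline</code>, <code class="language-plaintext highlighter-rouge">TimedOut</code> and <code class="language-plaintext highlighter-rouge">block_on</code> are hypothetical, and the one-thread-per-deadline timer is a deliberate simplification. In real code you would reach for your runtime’s timer, such as <code class="language-plaintext highlighter-rouge">tokio::time::timeout</code>.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};
use std::time::{Duration, Instant};

/// Hypothetical error type for an expired deadline.
#[derive(Debug, PartialEq)]
struct TimedOut;

/// Hypothetical wrapper: resolves to `Err(TimedOut)` once `deadline` passes.
struct WithDeadline&lt;F&gt; {
    inner: Pin&lt;Box&lt;F&gt;&gt;,
    deadline: Instant,
    timer_started: bool,
}

fn with_deadline&lt;F: Future&gt;(inner: F, timeout: Duration) -&gt; WithDeadline&lt;F&gt; {
    WithDeadline {
        inner: Box::pin(inner),
        deadline: Instant::now() + timeout,
        timer_started: false,
    }
}

impl&lt;F: Future&gt; Future for WithDeadline&lt;F&gt; {
    type Output = Result&lt;F::Output, TimedOut&gt;;

    fn poll(mut self: Pin&lt;&amp;mut Self&gt;, cx: &amp;mut Context&lt;'_&gt;) -&gt; Poll&lt;Self::Output&gt; {
        // Give the wrapped future the first chance to finish.
        if let Poll::Ready(out) = self.inner.as_mut().poll(cx) {
            return Poll::Ready(Ok(out));
        }
        let time_left = self.deadline.saturating_duration_since(Instant::now());
        if time_left.is_zero() {
            return Poll::Ready(Err(TimedOut));
        }
        if !self.timer_started {
            // Crude timer: one sleeping thread per deadline. It keeps the waker
            // from the first poll, which is fine for the `block_on` below.
            self.timer_started = true;
            let waker = cx.waker().clone();
            thread::spawn(move || {
                thread::sleep(time_left);
                waker.wake();
            });
        }
        Poll::Pending
    }
}

/// Minimal executor: park this thread until the future's waker fires.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc&lt;Self&gt;) {
        self.0.unpark();
    }
}

fn block_on&lt;F: Future&gt;(fut: F) -&gt; F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&amp;waker);
    loop {
        match fut.as_mut().poll(&amp;mut cx) {
            Poll::Ready(out) =&gt; return out,
            Poll::Pending =&gt; thread::park(),
        }
    }
}
</code></pre></div></div>

<p>With this in place, <code class="language-plaintext highlighter-rouge">block_on(with_deadline(fut, Duration::from_secs(3)))</code> works for any <code class="language-plaintext highlighter-rouge">fut</code>, whether it is reading from a socket, waiting on a channel or doing something else entirely. The wrapper never needs to know.</p>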

<p><a href="https://crates.io/crates/tower"><code class="language-plaintext highlighter-rouge">tower</code></a> is probably the best example of this phenomenon I can think of, and
it’s really <em>the</em> thing that made me realize how powerful <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> can
be. If you implement your service as an <code class="language-plaintext highlighter-rouge">async</code> function, you get timeouts,
rate limiting, load balancing, <a href="https://docs.rs/tower/0.4.13/tower/hedge/index.html">hedging</a>
and back-pressure handling. All of that for free.</p>

<p>It doesn’t matter what runtime you used, or what you’re actually doing in your
service. You can throw <a href="https://crates.io/crates/tower"><code class="language-plaintext highlighter-rouge">tower</code></a> at it to make it more robust.</p>

<p><a href="https://docs.rs/macroquad"><code class="language-plaintext highlighter-rouge">macroquad</code></a> is a miniature Rust game engine that aims to make game development
as easy as possible. Its main function uses <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> in order to run its
engine. This is because <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> is really the best way in Rust to
express a linear function that needs to be stopped in order to wait for
something else.</p>
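
<p>As a toy model of that idea, here’s a sketch of a “linear” game function that pauses once per frame while an outer loop drives it. This illustrates the general technique only; it is not macroquad’s actual implementation, and <code class="language-plaintext highlighter-rouge">NextFrame</code>, <code class="language-plaintext highlighter-rouge">next_frame</code>, <code class="language-plaintext highlighter-rouge">game</code> and <code class="language-plaintext highlighter-rouge">run_frames</code> are hypothetical stand-ins.</p>

<div class="language-rust highlighter-rouge"><div class="highlight"><pre class="highlight"><code>use std::future::Future;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical "wait until the next frame" future: pending once, then ready.
struct NextFrame {
    yielded: bool,
}

impl Future for NextFrame {
    type Output = ();

    fn poll(mut self: Pin&lt;&amp;mut Self&gt;, _cx: &amp;mut Context&lt;'_&gt;) -&gt; Poll&lt;()&gt; {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            Poll::Pending
        }
    }
}

fn next_frame() -&gt; NextFrame {
    NextFrame { yielded: false }
}

/// The game logic reads top to bottom, even though it stops every frame.
async fn game(log: &amp;mut Vec&lt;u32&gt;) {
    for frame in 0..3 {
        log.push(frame); // stand-in for "update and draw"
        next_frame().await;
    }
}

/// The driver polls unconditionally, so the waker has nothing to do.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc&lt;Self&gt;) {}
}

/// Polls the future once per "frame" and returns how many polls it took.
fn run_frames&lt;F: Future&gt;(fut: F) -&gt; usize {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&amp;waker);
    let mut polls = 1;
    while fut.as_mut().poll(&amp;mut cx).is_pending() {
        // A real engine would present the frame and pump window events here.
        polls += 1;
    }
    polls
}
</code></pre></div></div>

<p>The state that a hand-written frame callback would have to stash in a struct (here, just the loop counter) lives in the future instead.</p>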

<p>In practice, this can be extremely powerful. Imagine simultaneously polling
a network connection to your game server and your GUI framework, on the same
thread. The possibilities are endless.</p>

<h2 id="improving-asyncs-image">Improving Async’s Image</h2>

<p>I don’t think the issue is that some people think threads are better than
<code class="language-plaintext highlighter-rouge">async</code>. I think the issue is that the benefits of <code class="language-plaintext highlighter-rouge">async</code> aren’t widely
broadcast. This leads some people to be misinformed about the benefits of
<code class="language-plaintext highlighter-rouge">async</code>.</p>

<p>If this is an educational problem, I think it’s worth taking a look at the
educational material. Here’s what the <a href="https://rust-lang.github.io/async-book/01_getting_started/02_why_async.html">Rust Async Book</a> says when comparing
<code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> to operating system threads.</p>

<blockquote>
  <p><strong>OS threads</strong> don’t require any changes to the programming model, which makes it very easy to express concurrency. However, synchronizing between threads can be difficult, and the performance overhead is large. Thread pools can mitigate some of these costs, but not enough to support massive IO-bound workloads.</p>

  <p><em>- <a href="https://rust-lang.github.io/async-book/01_getting_started/02_why_async.html">Rust Async Book</a>, various authors</em></p>
</blockquote>

<p>I think this is a consistent problem throughout the <code class="language-plaintext highlighter-rouge">async</code> community. When
someone asks the question of “why do we want to use this over OS threads”,
people have a tendency to kind of wave their hand and say “<code class="language-plaintext highlighter-rouge">async</code> has less
overhead. Other than that, everything’s the same.”</p>

<p>This is the reason why web server authors switched to <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>. It’s how
they solved the <a href="https://en.wikipedia.org/wiki/C10k_problem">C10k problem</a>. But, it’s not going to be the reason why
everyone else switches to <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>.</p>

<p>Performance gains are
fickle and can disappear in the wrong circumstances. There are plenty of cases
where a threaded workflow can be faster than an equivalent <code class="language-plaintext highlighter-rouge">async</code> workflow
(mostly, in the case of CPU bound tasks). I think that we, as a community, have
over-emphasized the ephemeral performance benefits of <code class="language-plaintext highlighter-rouge">async</code> Rust while
downplaying its semantic benefits.</p>

<p>In the worst case, it leads to people shrugging off <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> as
“<a href="https://shnatsel.medium.com/smoke-testing-rust-http-clients-b8f2ee5db4e6">a weird thing that you resort to for niche use cases</a>”. It should be seen as
a powerful programming model that lets you succinctly express patterns that
can’t be expressed in synchronous Rust without dozens of threads and channels.</p>

<p>I also think there’s a tendency to try to make <code class="language-plaintext highlighter-rouge">async</code> Rust “just like sync
Rust” in a way that encourages negative comparison. By “tendency”, I mean that
it’s <a href="https://blog.rust-lang.org/inside-rust/2022/02/03/async-in-2022.html">the stated roadmap for the Rust project</a>,
which says that “writing async Rust code should be as easy as writing sync code, apart from the occasional <code class="language-plaintext highlighter-rouge">async</code> and <code class="language-plaintext highlighter-rouge">await</code> keyword.”</p>

<p>I reject this framing because it’s fundamentally impossible. It’s like trying to
host a pizza party on a ski slope. Sure, you can probably get 99% of the way
there, especially if you’re really talented. But there are differences that the
average bear <em>will</em> notice, no matter how good you are.</p>

<p>We shouldn’t be trying to force our model into unfriendly idioms to
appease programmers who refuse to adopt another type of pattern. We should be
trying to highlight the strengths of Rust’s <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code> ecosystem: its
composability and its power. We should be trying to make it so <code class="language-plaintext highlighter-rouge">async</code>/<code class="language-plaintext highlighter-rouge">await</code>
is the <em>default</em> choice whenever a programmer reaches for concurrency. Rather
than trying to make sync Rust and <code class="language-plaintext highlighter-rouge">async</code> Rust the same, we should embrace the
differences.</p>

<p>In short, we shouldn’t be using technical reasons to argue for a semantic model.
We should be using semantic reasons.</p>]]></content><author><name>John Nunley</name></author><category term="async" /><category term="rust" /><category term="smol" /><summary type="html"><![CDATA[A common refrain is that threads can do everything that async/await can, but simpler. So why would anyone choose async/await?]]></summary></entry></feed>