I hate to be the first one to tell you this, but Rust projects tend to have a lot of dependencies.
Disclaimer: I will take a look at some publicly available Rust crates today. Please do not harass their authors, maintainers or users.
I’m not kidding. Let’s check out ripgrep, one of the most popular Rust programs of all time. We can check the number of dependencies fairly easily by cloning the project, then running cargo tree through a nightmarish sed invocation.
$ git clone https://github.com/BurntSushi/ripgrep
$ cd ripgrep
$ cargo tree -e no-dev --prefix none | \
    sed -e 's/(\*)//g' -e 's/^[ \t]*//;s/[ \t]*$//' | \
    sort -u | wc -l
33
To break down each step of the shell command:
- cargo tree prints the dependencies of the current Cargo package in a “tree” format that lets you see which dependencies are brought in by which other dependencies.
- The -e no-dev option skips dev dependencies, which are only used for testing. By default cargo uses a tree formatting that we can skip using the option --prefix none.
- sed is a simple core utility that lets you run regular expressions on each line of a command’s output. Each expression is denoted by -e.
  - The first expression, s/(\*)//g, is a simple replacement that says “replace (s) every instance of (*) with nothing, globally (g)”. cargo tree adds this to repeated dependencies and I’d like to remove that.
  - The second expression, s/^[ \t]*//;s/[ \t]*$//, removes whitespace at the start and end of each line. This just cleans up the output in such a way that it doesn’t mess with our uniqueness test later.
- sort -u sorts every line and deletes duplicated lines.
- wc -l prints the total number of lines.
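For readers who find the sed invocation hard to follow, the same steps can be sketched in Rust. This is only an illustration: the function name count_unique_deps is made up here, and it assumes you have already captured the output of cargo tree --prefix none as a string.

```rust
use std::collections::BTreeSet;

/// Counts distinct lines in `cargo tree --prefix none` output,
/// mirroring the `sed | sort -u | wc -l` pipeline step by step.
fn count_unique_deps(cargo_tree_output: &str) -> usize {
    cargo_tree_output
        .lines()
        // Like `s/(\*)//g`: drop the marker cargo adds to repeated entries.
        .map(|line| line.replace("(*)", ""))
        // Like the second sed expression: trim whitespace at both ends.
        .map(|line| line.trim().to_string())
        // Like `sort -u`: a BTreeSet keeps one sorted copy of each line.
        .collect::<BTreeSet<String>>()
        // Like `wc -l`: count what is left.
        .len()
}
```

As with the shell version, the root package's own line is part of the output, so it is counted too.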
Thirty-three dependencies isn’t that bad for a highly advanced text matching program. Most of the dependencies are things like the Rust regex engine, or highly optimized text matching libraries.
Frankly, I can find a fairly good reason for these dependencies. Then again, it’s
a small command line utility, so it’s not like I’d expect it to have a million
dependencies. Great job!
So let’s try something a little more complicated. Specifically, let’s try a
networking application, since those are known for having unneeded dependencies.
I’ll pick one off the awesome list… ah, miniserve.
It should be a relatively small application, right? So let’s check the dependency count.
$ git clone https://github.com/svenstaro/miniserve
$ cd miniserve
$ cargo tree --no-default-features -e no-dev --prefix none | \
    sed -e 's/(\*)//g' -e 's/^[ \t]*//;s/[ \t]*$//' | \
    sort -u | wc -l
281
I specifically added --no-default-features to remove unneeded dependencies, and we still have a grand total of two hundred and eighty-one dependencies.
That’s quite a few! Looking into the dependency list, we can see:
- A QR code generator.
- The rand crate, both version v0.8.5 and version v0.9.1, as well as fastrand.
- actix-web, which brings in all of tokio and various crates for handling internet edge cases.
- Further duplicates of base64, hashbrown, syn and zerocopy.
I can justify a lot of these dependencies. The internet is a complicated place, and anything that needs to face the public ‘net and respond to 99% of web clients needs to deal with that. Still, that’s 281 pieces of code that the maintainers have to audit. Imagine if one of those dependencies is compromised, or just becomes unmaintained. Not to mention the fact that, for every duplicated crate, two copies of nearly the same code end up compiled into the final program.
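Duplicates like these can be picked out mechanically. Cargo has a built-in cargo tree --duplicates flag for this; the sketch below shows the same idea by hand. The function name duplicated_crates is hypothetical, and it assumes each line of cargo tree --prefix none output starts with a crate name followed by its version.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns the crates that show up with more than one version in
/// `cargo tree --prefix none` output, mapped to the versions seen.
fn duplicated_crates(cargo_tree_output: &str) -> BTreeMap<String, BTreeSet<String>> {
    let mut versions: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for line in cargo_tree_output.lines() {
        // Assumed line shape: `name vX.Y.Z [extra markers]`.
        let mut fields = line.split_whitespace();
        if let (Some(name), Some(version)) = (fields.next(), fields.next()) {
            versions
                .entry(name.to_string())
                .or_default()
                .insert(version.to_string());
        }
    }
    // Keep only crates seen with at least two distinct versions.
    versions.retain(|_, seen| seen.len() > 1);
    versions
}
```

Each entry in the result is a crate that gets built more than once, which makes it a good first candidate when trimming the tree.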
I’m not here to gawk at dependency graphs. Anyone can do that. I’d like to identify the scope of the problem, and see if we can figure out some solutions.
Dependency Dragback
To be clear, this problem isn’t just something randoms complain about. I’ve been on both sides of this issue. I’ve been in security audits where “the amount of code that gets pulled in” is an active demerit against integrating Rust into a project. I’ve also been the author of libraries where lowering the dependency count becomes a big issue.
I’d also like to be clear that this isn’t a problem unique to Rust. People have been complaining about extraneous packages on JavaScript and Python ever since they got package managers. JavaScript is famous for its “leftpad” incident, as well as simple pieces of code requiring two hundred dependencies. Even C++ runs into dependency problems once you add Boost to the equation.
In all of these languages, there are two types of dependencies, in my opinion.
- Dependencies that do something you don’t want to do yourself. You don’t want to implement an HTTP server, so you pull in axum.
- Dependencies that act as the canonical interface to some system facility or hardware device. Think the C library for making system calls, or the OpenGL library for putting things on the screen with the GPU.
To be clear, there are very good reasons to pull in the first kind of dependency. For things like cryptography or networking, writing low-level operations yourself is a one-way trip to security-breach-ville. Even for less critical operations, it’s usually better to use a tried-and-true, tested library than going and writing it yourself.
There’s also something to be said here about code reuse; there’s no reason why there should be more than one implementation of some algorithm in the Rust community, necessitating the splitting of work between experts. Why have one group of people review one piece of code and have another group of people review another piece of nearly identical code, when it would be better to have both groups review only one crate? This idea is the core reason why the x11rb-protocol crate was created.
That being said, there are some things that are just so trivial that using separate crates for them is far-and-away overkill. I’ve had to deal with crates that depend on heavy hitters like regex and nom for byte-munching operations that could have been implemented with basic slicing.
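As an illustration of what “basic slicing” can look like, here is a minimal sketch that splits a key=value line of bytes without any parsing crate. The function name and the input format are invented for this example; they are not taken from any of the crates mentioned.

```rust
/// Splits a `key=value` byte line at the first `=` using only slice operations.
/// Returns `None` when the line contains no `=`.
fn split_key_value(line: &[u8]) -> Option<(&[u8], &[u8])> {
    // Find the index of the first `=` byte, if any.
    let eq = line.iter().position(|&b| b == b'=')?;
    // Everything before it is the key, everything after it is the value.
    Some((&line[..eq], &line[eq + 1..]))
}
```

For anything more involved than this, a real parser starts to earn its place in the dependency tree.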
The worst offender for this problem is scopeguard. There are very few use cases where this crate is economical over a few extra lines of simple Rust code. Here is a quick polyfill:
// Before:
scopeguard::defer! {
    do_the_thing();
}

// After:
struct CallOnDrop<F: FnMut()>(F);

impl<F: FnMut()> Drop for CallOnDrop<F> {
    fn drop(&mut self) {
        // Run the stored closure when the guard goes out of scope.
        (self.0)();
    }
}

let _bomb = CallOnDrop(|| do_the_thing());
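To see a guard like this fire on an early return, here is a self-contained sketch. The run function and its log parameter are hypothetical names made up for the demonstration; only CallOnDrop comes from the polyfill.

```rust
use std::cell::RefCell;

/// Runs the wrapped closure when the value is dropped.
struct CallOnDrop<F: FnMut()>(F);

impl<F: FnMut()> Drop for CallOnDrop<F> {
    fn drop(&mut self) {
        (self.0)();
    }
}

/// Hypothetical helper: records events so the order of cleanup is visible.
fn run(bail_early: bool, log: &RefCell<Vec<&'static str>>) {
    // Bind the guard to a named variable; `let _ = ...` would drop it immediately.
    let _guard = CallOnDrop(|| log.borrow_mut().push("cleanup"));
    log.borrow_mut().push("start");
    if bail_early {
        return; // the guard still runs on this path
    }
    log.borrow_mut().push("finish");
}
```

Calling run(true, &log) records "start" and then "cleanup", while run(false, &log) records "start", "finish" and "cleanup", in that order.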
Safety Spinoff
Going back to my earlier point. These two types of dependencies exist in pretty much all languages. I argue that Rust has a third type:
- “Safety quarantine” crates that wrap some unsafe features in a safe wrapper. There are simpler ones like bytemuck which just wrap around simple data transmutations in a way that the standard library hasn’t gotten around to exposing yet. Then there are the C library wrappers like zstd which take a C library and make it safe. I’m largely talking about the bytemuck-type wrappers here.
When used effectively, these kinds of “safety quarantine” crates let you isolate unsafe code to specific parts of your dependency graph so you can be sure the rest is safe. If the other crates in your dependency tree use #![forbid(unsafe_code)], you can be sure that any behavioral problems will only come from those “quarantined” unsafe code crates.
In the dependency tree of smol, we use this strategy to great effect. It’s impossible to create a performant executor without some level of unsafe code. So we isolate all of the unsafe code to async-task, so async-executor can have almost no unsafe code. Similarly, efficient lock-free channels require unsafe code to implement on some level. So we isolate all of the unsafe code to concurrent-queue so that the async-channel crate can be #![forbid(unsafe_code)]. While smol may have quite a few dependencies, many of these dependencies exist in such a way that it completely eliminates unsafe code.
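The same pattern can be sketched inside a single file, with modules standing in for crates. This is only an analogy: the real crates apply #![forbid(unsafe_code)] at the crate level, and the module and function names below are hypothetical.

```rust
/// The "quarantine": the only module that contains `unsafe`.
mod quarantine {
    /// Views a slice of `u32` as raw bytes in native byte order.
    pub fn as_bytes(words: &[u32]) -> &[u8] {
        // SAFETY: `u32` has no padding bytes, `u8` has alignment 1, and the
        // returned slice covers exactly the memory of `words` for the same lifetime.
        unsafe {
            std::slice::from_raw_parts(
                words.as_ptr().cast::<u8>(),
                std::mem::size_of_val(words),
            )
        }
    }
}

/// Everything in this module is checked by the compiler to be free of `unsafe`.
#[forbid(unsafe_code)]
mod app {
    use super::quarantine;

    /// Hypothetical caller: sums the bytes of a word slice using only safe code.
    pub fn byte_sum(words: &[u32]) -> u32 {
        quarantine::as_bytes(words).iter().map(|&b| u32::from(b)).sum()
    }
}
```

An audit for memory safety then only has to read the quarantine module; if an unsafe block is ever added to app, the build fails.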
I think it would be nicer if more crates did this. eframe, one of the more popular GUI crates on the market, has around 120 dependencies. I wonder how much more palatable it would be if more of those dependencies were written with entirely safe code.
What do?
Still, many crates end up just having a smidgeon of unsafe code in them, however justified. Not to mention, safe code is code you still have to audit. So while there’s nothing wrong with handfuls of micro-crates that each do something well, we should still seek a way to reduce the number of dependencies in our dependency tree.
The first measure we can take is somewhat obvious, for both application and library developers. It’s possible to minimize features, which often minimizes dependencies. You can run cargo add with the --no-default-features flag, like so:
$ cargo add nom --no-default-features
In my own applications, I usually start by adding dependencies without default features, and then adding features piecemeal whenever I need them. Because of this, I can usually keep my dependency tree to a minimum.
When this fails, don’t be afraid to switch to another, parallel crate. For many dependency-heavy crates, there’s usually an alternative crate with far fewer dependencies, often while retaining the core functionality you depend on. For example, instead of futures, consider using futures-lite. Instead of actix-web, maybe see if your use case can be fulfilled by axum.
I’ve found that, with these two strategies, you can minimize the dependencies of your crate or application, usually to a much more manageable level.