Matklad's most recent talk, "Building Systems, Simply", was really eye-opening and inspiring for me. I'm a young developer with almost no real-world professional experience, but I still like to read a lot of opinions, articles and blogs about programming, programming history and software engineering practices.

A recent trend in what I've read is the idea that complexity should be avoided as much as possible, and that software engineering has been led astray by "best practices", over-abstraction and a general overcomplication of things that don't need to be complicated.

Even though I haven't actually encountered the so-called complexity demons in the wild, the idea that simplicity is paramount to software sustainability still resonates with me. It's sometimes hard for me to judge which solution is the simple one and which isn't, though. Remember, simple ain't easy, and if the tradeoffs between simplicity and complexity were always clear-cut we probably wouldn't be in this situation. The reality is that incidental and inherent complexity are sometimes really hard to distinguish, especially for someone with less experience.

That's why I enjoyed this talk so much: Matklad gave real-life examples of how this philosophy of simple systems is carried out at TigerBeetle, a financial transaction database written in Zig. His first point is about project structure, and how making onboarding and day-to-day tasks simple (and fast) makes for a better developer experience.

He also talks about TigerBeetle's fuzzer, which runs non-stop on a fleet of machines (on different operating systems) against all branches and pull requests. The problem he states is this: if you modify the fuzzer's source code, how do you upgrade it on all machines? This can be solved in a lot of ways; some of the alternatives he mentions are Kubernetes and systemd. Their actual solution, however, is remarkably simple. Each machine runs a script (some 20 lines of shell) which clones the git repo from scratch, installs the compiler (using their own install script), builds the code and runs it. This script repeats ad infinitum. Importantly, the script itself doesn't change, since it's manually copied to the machine and executed from there. This means that whenever a new change lands on the main branch, every machine running the fuzzer will finish its current fuzzing job and automatically upgrade to the newest fuzzer version.
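To make the shape of that loop concrete for myself, here is a rough sketch. It is written in Python purely for illustration (the real thing is a short shell script that I have not seen), and the repository URL, the install script path and the build command are all hypothetical placeholders, not TigerBeetle's actual ones.

```python
import subprocess
import tempfile
import time
from typing import Callable, Optional, Sequence

# Hypothetical placeholders: the talk does not give the real URL or commands.
REPO_URL = "https://example.com/project.git"
INSTALL_COMPILER = ["./scripts/install_compiler.sh"]
BUILD_AND_FUZZ = ["./scripts/build_and_fuzz.sh"]

Runner = Callable[[Sequence[str], str], None]


def run_checked(cmd: Sequence[str], cwd: str) -> None:
    """Run one command in cwd and raise if it exits with an error."""
    subprocess.run(list(cmd), cwd=cwd, check=True)


def fuzz_cycle(run: Runner = run_checked) -> None:
    """One iteration: fresh clone, install the compiler, build and fuzz."""
    with tempfile.TemporaryDirectory() as workdir:
        # Cloning from scratch means each cycle picks up the newest code.
        run(["git", "clone", REPO_URL, "."], workdir)
        run(INSTALL_COMPILER, workdir)
        run(BUILD_AND_FUZZ, workdir)


def run_forever(run: Runner = run_checked,
                max_cycles: Optional[int] = None) -> None:
    """Repeat fuzz_cycle; max_cycles exists only so the loop can be tested."""
    done = 0
    while max_cycles is None or done < max_cycles:
        try:
            fuzz_cycle(run)
        except (subprocess.CalledProcessError, OSError) as err:
            # A failed cycle should not kill the loop; wait and retry.
            print(f"cycle failed, retrying: {err}")
            time.sleep(10)
        done += 1
```

Calling `run_forever()` on a machine would keep it fuzzing whatever the repository currently contains. The upgrade mechanism is simply that nothing survives from one cycle to the next.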

This is only possible because all the operations in the script are really fast: the repository is kept small so a git clone is very fast, the compiler install script doesn't take long, and compilation is also fast. This is a really nice reminder that performance is a feature. If any of these operations were slow, this workflow just wouldn't work and they'd need to reach for another tool to handle the job. That would probably work fine, maybe even better, but it would definitely be more complex.

Another interesting thing is that TigerBeetle in general shies away from anything that isn't Zig since, as the saying goes, "the right tool for the job is the one you already have". For example, they guard against big files inside the repository by implementing that logic inside a Zig[1] test[2], not a shell script, a git hook or a GitHub Action. He goes on to show more things which would generally be done as a shell script or a GitHub Actions YAML file, but which they do entirely in Zig.
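Here is a minimal sketch of what a "no big files" check living in the test suite could look like. It is in Python only because that is quick to read; TigerBeetle's version is written in Zig and I am not reproducing it. The size limit and all function names are my own assumptions. The only external call is `git ls-files -z`, which lists tracked files separated by NUL bytes.

```python
import os
import subprocess
import unittest

MAX_FILE_BYTES = 1024 * 1024  # hypothetical limit; the talk does not state one


def tracked_files(repo_root="."):
    """Return the paths of all files tracked by git, relative to repo_root."""
    out = subprocess.run(
        ["git", "ls-files", "-z"],
        cwd=repo_root, check=True, capture_output=True,
    ).stdout
    return [p.decode() for p in out.split(b"\0") if p]


def oversized(paths, limit=MAX_FILE_BYTES, root="."):
    """Return (path, size) pairs for existing files larger than limit bytes."""
    result = []
    for path in paths:
        full = os.path.join(root, path)
        if not os.path.isfile(full):
            continue  # tracked but deleted in the working tree
        size = os.path.getsize(full)
        if size > limit:
            result.append((path, size))
    return result


class RepoHygiene(unittest.TestCase):
    def test_no_large_files(self):
        # Fails with the list of offending files if any are too big.
        self.assertEqual(oversized(tracked_files()), [])
```

Because it is just another test, it runs wherever the rest of the suite runs, with no separate hook or CI step to install or maintain.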

The only places where this is not the case are also very interesting case studies of inherent complexity: the Zig toolchain install script and the multi-platform CI system. For the install script, since you must assume the user doesn't have a Zig toolchain available, you obviously need to rely on something else, in this case the shell. But even here, they carefully craft a single polyglot script that runs just fine on Linux, Mac and Windows and shells out to the actual system-dependent scripts. This is valuable since it eliminates operating-system-specific instructions.

The other place where they have an external "dependency" is their CI. Even though most of their core CI logic is implemented in Zig, setting up machines with multiple different operating systems and versions is inherently complex, and there's no way around that. Here TigerBeetle doesn't look for an in-house solution but uses a tool built specifically for this: GitHub Actions. They have only one action YAML file, whose only job is provisioning machines with different architectures and operating systems and running all the Zig code and tests on them.

Most of the content on simplicity and complexity repeats the saying that complexity should be avoided unless completely necessary. While I do agree with this statement (and I'd wager most developers do too), the difficult part is actually knowing what is necessary complexity and what isn't. That's why I really enjoyed his talk: he demonstrated what he and the TigerBeetle team considered inherent complexity, what they didn't, and what they actually did about it.

  1. Matklad makes it very clear in the talk that this would obviously be better done in something more scripty like shell, Python, Perl, etc. However, any one of those would add more complexity to the project: shells are not really portable, so you'd need at least two different versions, one for Windows and one for Mac+Linux; Python (or any other interpreted language) is a big dependency to add, especially since it would be only for this use case.

  2. This actually led me to wonder: what other checks can we put inside tests? The most obvious ones are code style, formatting, maybe even git commit message validation! Since the tests run inside the git repo, we can inspect the latest commit for whatever we want (it has to have a description as well as a summary, a link to the relevant issue, etc.)
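As a rough sketch of that last idea (again in Python just for brevity; nothing here comes from the talk), a test could read the latest commit message with `git log -1 --format=%B` and check it against a few rules. The rules themselves, including the `#123` issue-reference style and the blank line after the summary, are conventions I am assuming for the example.

```python
import re
import subprocess

ISSUE_REF = re.compile(r"#\d+")  # hypothetical convention, e.g. "Closes #123"


def latest_commit_message(repo_root="."):
    """Return the full message (summary and body) of the latest commit."""
    return subprocess.run(
        ["git", "log", "-1", "--format=%B"],
        cwd=repo_root, check=True, capture_output=True, text=True,
    ).stdout


def commit_message_problems(message):
    """Return a list of human-readable problems; empty means the message is fine."""
    lines = message.strip().splitlines()
    if not lines:
        return ["empty message"]
    problems = []
    # Assumed convention: summary line, blank line, then the description.
    if len(lines) > 1 and lines[1].strip():
        problems.append("second line must be blank")
    body = "\n".join(lines[2:]).strip()
    if not body:
        problems.append("missing description")
    if not ISSUE_REF.search(message):
        problems.append("missing issue reference")
    return problems
```

Inside the test suite this becomes a single assertion, something like `assert commit_message_problems(latest_commit_message()) == []`.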