Your Roots Are Showing

May 10, 2023
bazel tech

To know where we’re going, it helps to know where we’ve come from. While this is true for ourselves, it’s also true of software. It helps explain why some of the things that look unusual or unlikely have been done that way, and helps demonstrate some of the forces that have acted on the design of an app’s UI and UX.

Why do I mention this? Because when I introduce people to Bazel I find it helpful to explain where the tool came from in order to understand why it has the UI that it does. A lot of this is hinted at in the post that Mike Bland wrote over 10 years ago, but perhaps now is the time to flesh out the story a little more. I’d suggest going to read his post before carrying on here. It’s an interesting read, and we’re not in a rush. Go for it.

The first place where Bazel’s roots are showing is in its own name. Looking at the source of Bazel, it should come as no surprise that it’s derived from blaze: Google’s own build tool. Indeed, “bazel” is an anagram of “blaze”.

But blaze wasn’t created in a vacuum. When it was introduced at Google, it replaced the older build system, which relied on a two-step process to perform a build. The first step was to run a tool that took build files and converted them into a Makefile. The second step was to run the build itself.

The build files were an amazing abstraction. Rather than describing the individual steps required to build an artifact, they simply described the kinds of artifacts to be built. If you saw one now, it would feel remarkably familiar. Without that abstraction, I’m not sure how easy it would have been to keep growing the Google monorepo.
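To make that concrete, here is a rough sketch of the declarative style, written as plain Python so that it can actually be run. The `cc_library` and `cc_binary` functions below are stand-ins defined purely for illustration and named after the Bazel rules of the same name; this is an assumption about the general shape, not a reproduction of what the original build files looked like.

```python
# Registry filled in as the "build file" further down is evaluated.
TARGETS = {}


def _declare(kind, name, srcs, deps=()):
    # Record what should exist, not how to produce it.
    TARGETS[name] = {"kind": kind, "srcs": list(srcs), "deps": list(deps)}


def cc_library(name, srcs, deps=()):
    # Illustrative stand-in, not Bazel's implementation.
    _declare("cc_library", name, srcs, deps)


def cc_binary(name, srcs, deps=()):
    # Illustrative stand-in, not Bazel's implementation.
    _declare("cc_binary", name, srcs, deps)


# The "build file" itself: kinds of artifacts and their inputs,
# with no mention of compilers, flags, or the order of steps.
cc_library(name = "greeting", srcs = ["greeting.cc"])
cc_binary(name = "hello", srcs = ["main.cc"], deps = [":greeting"])

print(TARGETS["hello"]["deps"])  # [':greeting']
```

The point is that the tool, not the person writing the build file, decides which commands to run to turn those descriptions into artifacts.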

The build files needed to be written in a language of some sort, and at Google there were (notoriously) four “blessed” languages for writing code: C++ for performance-critical code (and if you liked it), Java for other server-side code, JS because that’s what you needed to run in a browser, and Python for everything else. Clearly, the sensible choice from this list if you needed a programmatic way of describing your build was Python.

And, sure enough, originally the build files were interpreted using Python.

As a little historical note, this is the same approach we took when we were developing Buck too, and that shouldn’t come as a surprise since the team working on Buck were largely part of the Xoogler diaspora. But I digress…

However, there’s one huge problem with interpreting user-supplied build files written in fragments of Python in a build tool that’s meant to be deterministic and reproducible: you can do just about anything, including futzing with the file system, or reaching out to network resources. Worse, there was no way to determine whether “parsing” the build files would ever finish, or could be done without undue computational load on the machine doing the build.

So, it was decided that it was better to use a tightly constrained subset of Python. By providing a different interpreter, it would be possible to avoid accidentally relying on modules that were only installed on a handful of machines. It would also be possible to prove that parsing the build files would complete (yay! No halting problem!).

And if you go and read the goals of Starlark, you’ll see that this is exactly what happened. Put another way, Starlark is another place where the roots of Bazel shine through: it looks like Python because at one stage it was Python, and it was simpler to slowly tighten the constraints of what was allowed in build files over time than to rewrite every build file in the whole of Google’s monorepo. Fortunately, originally most of the build files weren’t doing anything fancy to begin with, and so could be interpreted using this new subset of Python.
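To give a flavour of what “tightening the constraints” can mean, here is a minimal sketch that uses Python’s own `ast` module to flag constructs a restricted dialect might refuse. The `FORBIDDEN_NODES` table and the `find_violations` helper are my own illustrative names, and the list is deliberately short. Starlark does leave out imports, `while` loops and classes, but this is not how Starlark is implemented, and its real rules go well beyond this.

```python
import ast

# Node types this sketch rejects. Illustrative only: the real
# restrictions in Starlark cover far more than these four.
FORBIDDEN_NODES = {
    ast.Import: "import",
    ast.ImportFrom: "import",
    ast.While: "while loop",
    ast.ClassDef: "class definition",
}


def find_violations(source):
    """Return sorted (line, description) pairs for disallowed constructs."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        for node_type, description in FORBIDDEN_NODES.items():
            if isinstance(node, node_type):
                violations.append((node.lineno, description))
    return sorted(violations)


build_file = '''
import os
cc_library(name = "hello", srcs = ["hello.cc"])
while True:
    pass
'''

print(find_violations(build_file))  # [(2, 'import'), (4, 'while loop')]
```

Note that this only inspects the syntax; it never runs the build file. A build file that sticks to plain function calls and literals passes untouched, which matches the observation that most build files weren’t doing anything fancy in the first place.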

But we’re not done yet! There’s one other thing that Mike mentions in his post that is pertinent to this discussion of Bazel’s past leaking into its UX, and that is that Google used Perforce for source control.

Now, if you’ve been fortunate enough to be introduced to source control in the modern age, you may not be aware of just how many source control systems there used to be. In the Open Source world, the move from RCS to CVS allowed us to group changes to multiple files into a single commit. The move from CVS to Subversion made those commits atomic (prior to Subversion, if two people used CVS to commit a change at the same time, it was possible for two separate commits to get the same revision number, and that led to plenty of hilarity).

But there weren’t just Open Source source control systems out there. For example, the well-known falling out of the Linux kernel developers with BitKeeper led Linus Torvalds to create git, and also led to the creation of mercurial.

But not all source control tools are, or were, Open Source. There were many commercial ones, and Google had settled on Perforce, which had a reputation for being flexible, fast, and capable.

The way that Perforce works is that you create a Perforce client. This is done by specifying paths within the repo that you want to check out, and then running the p4 tool to get everything dragged down from the Perforce server to your local disk.

These paths will look familiar to anyone who’s used Bazel because they look exactly like the label syntax that is used for specifying targets, //they/look/like/this/...

The original tooling at Google took advantage of this by providing a utility that allowed a developer to clone a minimal but sufficient part of the larger monorepo. They did this by specifying the build targets to build, finding the relevant build files by converting build target paths to the Perforce equivalents (a trivial transformation), and then parsing those to extract more paths, and so on, until you had everything you needed.
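The shape of that utility can be sketched in a few lines of Python. This is a guess at the general idea rather than the real tool: the build file name `BUILD`, the `read_deps` callback and the fake depot dictionary are all assumptions made for illustration.

```python
from collections import deque

BUILD_FILE_NAME = "BUILD"  # assumption: the post does not name the file


def package_of(label):
    """'//foo/bar:baz' -> '//foo/bar' (the "trivial transformation")."""
    return label.split(":", 1)[0]


def packages_needed(targets, read_deps):
    """Collect every package reachable from the requested targets.

    read_deps(path) is a hypothetical callback that fetches and parses
    one build file and returns the labels it depends on.
    """
    seen = set()
    queue = deque(package_of(target) for target in targets)
    while queue:
        package = queue.popleft()
        if package in seen:
            continue
        seen.add(package)
        deps = read_deps(package + "/" + BUILD_FILE_NAME)
        queue.extend(package_of(dep) for dep in deps)
    return sorted(seen)


# Made-up stand-in for the server: build file path -> labels it depends on.
FAKE_DEPOT = {
    "//app/server/BUILD": ["//lib/net:http", "//lib/base:strings"],
    "//lib/net/BUILD": ["//lib/base:strings"],
    "//lib/base/BUILD": [],
}

print(packages_needed(["//app/server:main"], FAKE_DEPOT.__getitem__))
# ['//app/server', '//lib/base', '//lib/net']
```

Because labels and depot paths share a shape, the only string manipulation needed is dropping the target name after the colon. Everything else is a plain breadth-first walk over the dependency graph.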

So, this has been a lot of words to describe three places where Bazel’s history has leaked into its current incarnation:

  1. The name is an anagram of blaze.
  2. Starlark looks like Python because it once was Python.
  3. Bazel labels look like Perforce paths because they were originally Perforce paths.