TL;DR
“Agents in agents driving scripts” is a pattern for handling large-scale, mechanical changes using AI. Each layer of the pattern is developed iteratively, using failures from previous runs to refine either the prompts or the scripts being used.
The long version
I’m giving a worked example in this post in order to demonstrate how to apply the idea. This post isn’t about either bazel or gazelle. With that out of the way….
I’ve been exploring how I might generate build files in a large Bazel-based repo which currently uses hand-crafted build files. The theory of the migration is simple. We know the code compiles now, and that information is encoded in the build graph. We can use bazel query to inspect this graph, and the graph output format makes it possible to topologically sort the packages by their dependencies. Once that’s done, we can start the migration, beginning at the directories with no dependencies and ending with those which need the entire build graph migrated.
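As a sketch of that first step: bazel query can emit the dependency graph in DOT format, which is easy to parse and topologically sort. The query scope and the exact invocation below are illustrative and would need adjusting for a real repo.

```python
import re
import subprocess
from collections import defaultdict, deque

# Matches DOT edge lines like:  "//a:a" -> "//b:b"
EDGE_RE = re.compile(r'"([^"]+)" -> "([^"]+)"')

def parse_edges(graph_text):
    """Extract (src, dst) dependency edges from `bazel query --output graph` text."""
    return EDGE_RE.findall(graph_text)

def toposort_packages(edges):
    """Order packages so that each package's dependencies come before it.

    An edge "//a:a" -> "//b:b" means a depends on b, so //b must be
    migrated before //a.
    """
    pkg = lambda label: label.split(":")[0]
    deps = defaultdict(set)   # package -> packages it depends on
    rdeps = defaultdict(set)  # package -> packages that depend on it
    packages = set()
    for src, dst in edges:
        s, d = pkg(src), pkg(dst)
        packages.update((s, d))
        if s != d:
            deps[s].add(d)
            rdeps[d].add(s)
    # Kahn's algorithm, starting from the packages with no dependencies.
    ready = deque(sorted(p for p in packages if not deps[p]))
    order = []
    while ready:
        p = ready.popleft()
        order.append(p)
        for r in sorted(rdeps[p]):
            deps[r].discard(p)
            if not deps[r]:
                ready.append(r)
    return order

def migration_order(universe="//..."):
    """Query the live build graph (only works inside a bazel workspace)."""
    out = subprocess.run(
        ["bazel", "query", f"deps({universe})", "--output", "graph"],
        check=True, capture_output=True, text=True,
    ).stdout
    return toposort_packages(parse_edges(out))
```

The parsing and sorting are pure functions, so they can be tested without a bazel workspace at all — which, as it turns out, is exactly the property you want when an LLM is iterating on the script.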
We can tell whether a build file is being managed by gazelle by checking that neither the exclude nor the ignore directive is present.
With this information in mind, the process starts by setting up gazelle in the repo, and updating all the build files to indicate that gazelle shouldn’t process them by adding the exclude directive. Once that’s done the fun can begin.
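A sketch of both halves — the managed-check and the bulk stamping — might look like this. The directive names follow the post; the exact directive and syntax your gazelle version expects may differ.

```python
from pathlib import Path

# Directive names as used in the post; verify against your gazelle setup.
MARKERS = ("# gazelle:exclude", "# gazelle:ignore")
STAMP = "# gazelle:exclude\n"

def is_gazelle_managed(build_file: Path) -> bool:
    """A build file is gazelle-managed if neither directive appears in it."""
    text = build_file.read_text()
    return not any(marker in text for marker in MARKERS)

def mark_unmanaged(build_file: Path) -> bool:
    """Prepend the exclude directive so gazelle skips this file."""
    if not is_gazelle_managed(build_file):
        return False  # already marked
    build_file.write_text(STAMP + build_file.read_text())
    return True

def mark_all(repo_root: Path) -> int:
    """Stamp every BUILD/BUILD.bazel file under repo_root; return count changed."""
    return sum(
        mark_unmanaged(f)
        for name in ("BUILD", "BUILD.bazel")
        for f in repo_root.rglob(name)
    )
```

Running `mark_all` twice should change nothing the second time — the migration then consists of removing these stamps one directory at a time, in topological order.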
Ultimately, I know the process of migrating a build file to gazelle is pretty mechanical. I also know that running a script is often faster, cheaper, and more deterministic than using an LLM for the same task.
But we don’t have the script yet. So we use an LLM to write one.
To begin with, I start an interactive LLM and work with it to migrate a single directory. Once I’m happy with that, I ask the LLM to create a (python!) script to perform the same actions. The key thing here is to insist that if there were any errors or problems, the script must fail and report the problem.
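The skeleton of such a script might look like the following — the important part is the fail-loudly behaviour, not the specific commands, and the `//:gazelle` target name is a placeholder for however gazelle is wired up in your repo.

```python
import subprocess
import sys

def run(cmd):
    """Run a command; on any failure, raise with the full output attached."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        raise RuntimeError(
            f"{' '.join(cmd)} failed (exit {result.returncode}):\n"
            f"{result.stdout}{result.stderr}"
        )
    return result.stdout

def migrate_directory(directory):
    # Let gazelle regenerate the build file for just this directory.
    # "//:gazelle" is a placeholder for your repo's gazelle target.
    run(["bazel", "run", "//:gazelle", "--", directory])
    # The migrated package, and everything beneath it, must still build.
    run(["bazel", "build", f"//{directory}/..."])

# Usage: migrate_directory(sys.argv[1])
```

Because every command goes through `run`, any failure surfaces as an exception with the command and its output attached — exactly the report the agent (or you) needs in order to refine the script.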
The next step is to write a prompt that uses this script, and save that prompt to a file. The prompt first checks that we’re ready to start (is the build still green? is running gazelle a no-op? etc.), then runs the script to migrate the directory, and finally validates that the change is safe (normally checks very similar to the start conditions, with some extras). The end goal is to “one shot” an update, so the prompt is told to stop if there are any issues at all.
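Purely for illustration, such a prompt file might read something like this (the script name and the exact checks are placeholders):

```
You are migrating a single directory's build file to gazelle.

Preconditions — verify these before doing anything, and STOP if any fail:
1. `bazel build //...` succeeds (the build is green).
2. Running gazelle now is a no-op (the working tree is clean afterwards).

Then run the migration script for the target directory.

Validation — verify after the script runs, and STOP on any failure:
1. `bazel build //...` still succeeds.
2. Re-running gazelle is a no-op.
3. Only the target directory's build file changed.

If anything at all looks wrong, stop immediately and report the problem.
Do not attempt to fix it yourself.
```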
This is where the iterative improvement happens. Each time an issue stops the process, we use that failure to immediately update the script, add additional scripts, or refine the original prompt to avoid future problems. This continuous feedback loop means the next iteration will be smoother.
Once we achieve this single-directory “one-shot” success, we move up a level and start on an orchestration prompt. Again, we start this manually. My approach is to tell the orchestrator to start a number of subagents, each of which should use the polished prompt and script we’ve developed above. Again, if there are any problems, the orchestrator should stop everything and report the issues, grouping them as required. Each time we resolve a problem, we ask for either the scripts or the prompts to be updated. It’s an iterative process.
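Again purely as illustration, the orchestration prompt might read along these lines (the filename, parallelism limit, and directory list are placeholders):

```
Migrate the following directories, in this order: <list>.

For each directory, start a subagent using the prompt in
migrate_directory_prompt.md. You may run several subagents in parallel,
but never two whose directories overlap in the dependency graph.

If any subagent reports a problem, stop ALL work immediately. Collect
the failures, group similar ones together, and report them to me.
Do not try to fix anything yourself.
```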
Then we get to the fun bit.
We start the orchestration agent with our orchestration prompt; it fires up subagents using the migration prompt, and those subagents run the scripts we’ve been refining. Agents in agents driving scripts.
Why “Dorodango”?
Dorodango is the Japanese art of polishing mud: taking something that’s rough and unformed and, by iterating and smoothing, making a beautiful sphere out of the most unlikely starting material. In the same way, this pattern takes a rough and unformed idea and, by iterating and smoothing, ends up with an efficient way of managing change.
Comparison with Wiggum Loops
While both the Dorodango Technique and Wiggum Loops are methodologies for using LLMs in iterative, self-correcting development, they differ in their structure and approach to failure. The Dorodango Technique is about iterating on the design (the scripts and prompts) after a failure in order to build a robust system. It uses the LLM to write the system, but you are in the loop to refine it. Wiggum Loops are about iterating on the code or output until the LLM successfully completes a well-defined task autonomously; they are designed for maximum automation once the loop is started.