I worked on building this at $PREV_EMPLOYER. We used a single repo for many services, so that you could run tests on all affected binaries/downstream libraries when a library changed.
We used Bazel to maintain the dependency tree, and then triggered builds based on a custom Github Actions hook that would use `bazel query` to find the transitive closure of affected targets. Then, if anything in a directory was affected, we'd trigger the set of tests defined in a config file in that directory (defaulting to :...), each as its own workflow run that would block PR submission. That worked really well, with the only real limiting factor being the ultimate upper limit of a repo in Github, but of course took a fair amount (a few SWE-months) to build all the tooling.
We’re in the middle of this right now. Go makes this easier: there’s a go CLI command that you can use to list a package’s dependencies, which can be cross-referenced with recent git changes. (duplicating the dependency graph in another build tool is a non-starter for me) But there are corner cases that we’re currently working through.
This, and if you want build + deploy that’s faster than doing it manually from your dev machine, you pay $$$ for either something like Depot, or a beefy VM to host CI.
A bit more work on those dependency corner cases, along with an auto-sleeping VM, should let us achieve nirvana. But it’s not like we have a lot of spare time on our small team.
* In addition, you can make your life a lot easier by just making the whole repo a single Go module. Having done the alternate path - trying to keep go.mod and Bazel build files in sync - I would definitely recommend only one module per repo unless you have a very high pain tolerance or actually need to be able to import pieces of the repo with standard Go tooling.
> a beefy VM to host CI
Unless you really need to self-host, Github Actions or GCP Cloud Build can be set up to reference a shared Bazel cache server, which lets builds be quite snappy since it doesn't have to rebuild any leaves that haven't changed.
I've heard horror stories about Bazel, but a lot of them involve either not getting full buy in from the developer team or not investing in building out Bazel correctly. A few months of developer time upfront does seem like a steep ask.
We used Bazel to maintain the dependency tree, and then triggered builds based on a custom Github Actions hook that would use `bazel query` to find the transitive closure of affected targets. Then, if anything in a directory was affected, we'd trigger the set of tests defined in a config file in that directory (defaulting to :...), each as its own workflow run that would block PR submission. That worked really well, with the only real limiting factor being the ultimate upper limit of a repo in Github, but of course took a fair amount (a few SWE-months) to build all the tooling.