perf: union rdeps queries across changed modules into one subprocess #337

Open
rdark wants to merge 1 commit into Tinder:master from rdark:perf/union-rdeps-queries

rdark commented Apr 28, 2026

Partial fix for #335

`CalculateImpactedTargetsInteractor.queryTargetsDependingOnModules` previously spawned one `bazel query rdeps(//..., @@<repo>//...)` subprocess for every canonical repo that matched a changed module. A single changed module can substring-match thousands of repos in a bzlmod workspace, and each subprocess pays ~2s of JVM + bazel-client-connect overhead, serially.
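
To make the fan-out concrete, here is a minimal Python sketch of the previous shape, not the project's actual code. The repo names, `match_repos`, and `per_repo_queries` are hypothetical; only the substring matching and the ~2s figure come from the description above.

```python
# Hypothetical canonical repo names; real names depend on the workspace.
CANONICAL_REPOS = [
    "rules_jvm_external+",
    "rules_jvm_external++maven+maven",
    "rules_jvm_external++maven+com_google_guava_guava",
    "rules_kotlin+",
]

STARTUP_SECONDS = 2.0  # the "~2s" per-subprocess overhead quoted above


def match_repos(changed_module: str, canonical_repos: list[str]) -> list[str]:
    """Return every canonical repo whose name contains the module name."""
    return [repo for repo in canonical_repos if changed_module in repo]


def per_repo_queries(repos: list[str]) -> list[str]:
    """One rdeps query per matched repo, as in the previous behaviour."""
    return [f"rdeps(//..., @@{repo}//...)" for repo in repos]


matched = match_repos("rules_jvm_external", CANONICAL_REPOS)
queries = per_repo_queries(matched)
serial_overhead = len(queries) * STARTUP_SECONDS
print(f"{len(queries)} subprocesses, ~{serial_overhead:.0f}s of startup overhead")
```

With three matches the overhead is small, but it grows linearly with the number of matched repos, which is the case this PR targets.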

Union all matched repos into a single `rdeps(//..., @@a//... + @@b//... + ...)` query. Bazel computes the reverse-dep graph of `//...` once regardless of how many patterns are in the union, so runtime collapses from N × (startup + analysis) to 1 × (startup + analysis); the N-1 eliminated subprocesses are the bulk of the saving.
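
As an illustration only, the unioned expression could be assembled roughly as follows. `build_union_rdeps_query` is a hypothetical helper, and the de-duplication and sorting are assumptions of this sketch rather than something the PR states.

```python
def build_union_rdeps_query(repos: list[str]) -> str:
    """Build one rdeps query over the union of all matched repos.

    Duplicates are dropped and names are sorted so the query text is
    deterministic; both choices are assumptions of this sketch.
    """
    unique = sorted(set(repos))
    if not unique:
        raise ValueError("no matched repos to query")
    # `+` is the union operator in the Bazel query language.
    union = " + ".join(f"@@{repo}//..." for repo in unique)
    return f"rdeps(//..., {union})"


query = build_union_rdeps_query(["b", "a", "b"])
print(query)  # rdeps(//..., @@a//... + @@b//...)
```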

  • No command-line length concern: `BazelQueryService.runQuery` writes queries via `--query_file`, so the query string can be arbitrarily long.
  • Failure semantics: a single try/catch wraps the unioned query; on failure, fall back to marking all workspace targets impacted. The previous outer catch-all is removed: an audit confirmed that every call that can throw now has a tight try/catch around it, and the broad catch was silently swallowing errors.
  • Logging: the per-module log line keeps the "module X → matched N repos: ..." attribution so the logging contract is not broken.
  • Tests: adds `testUnionsRdepsAcrossChangedModules`, which uses two changed modules to assert the single-query invariant.
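
The points above can be sketched in Python as follows. This is a hedged illustration, not the PR's implementation: `bazel_query_argv`, `targets_depending_on_modules`, the injected `run_query` callable, and the empty-result shortcut when nothing matched are all assumptions. Only `--query_file`, the single try/catch with the all-targets fallback, and the log-line shape come from the description.

```python
import logging
import os
import tempfile
from typing import Callable

logger = logging.getLogger("impacted_targets_sketch")


def bazel_query_argv(query: str) -> tuple[list[str], str]:
    """Write the query to a temp file and return (argv, path).

    Passing the expression through --query_file keeps it off the command
    line, so its length is not bounded by the OS argument limit.
    """
    fd, path = tempfile.mkstemp(suffix=".query")
    with os.fdopen(fd, "w") as handle:
        handle.write(query)
    return ["bazel", "query", f"--query_file={path}"], path


def targets_depending_on_modules(
    matches: dict[str, list[str]],
    all_targets: set[str],
    run_query: Callable[[str], set[str]],
) -> set[str]:
    """Run one unioned rdeps query; treat everything as impacted on failure."""
    # Keep per-module attribution in the log even though the query is shared.
    for module, repos in matches.items():
        logger.info("module %s → matched %d repos: %s",
                    module, len(repos), ", ".join(repos))
    repos = sorted({repo for group in matches.values() for repo in group})
    if not repos:
        return set()  # assumption: nothing matched, so nothing to query
    query = "rdeps(//..., " + " + ".join(f"@@{r}//..." for r in repos) + ")"
    try:
        return run_query(query)
    except Exception:
        logger.exception("unioned rdeps query failed; marking all targets impacted")
        return set(all_targets)


# Usage with stand-in runners instead of a real bazel subprocess.
calls: list[str] = []


def fake_run(query: str) -> set[str]:
    calls.append(query)
    return {"//app:lib"}


def failing_run(query: str) -> set[str]:
    raise RuntimeError("bazel query failed")


ALL_TARGETS = {"//app:lib", "//app:bin"}
ok = targets_depending_on_modules({"m1": ["a"], "m2": ["b"]}, ALL_TARGETS, fake_run)
fallback = targets_depending_on_modules({"m1": ["a"]}, ALL_TARGETS, failing_run)
```

Injecting `run_query` lets the single-query invariant be checked by counting calls, in the spirit of the test the PR adds.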
