Add lightweight bi output group to skip linking during Index Build #3302
brentleyjones merged 1 commit into main
Introduce a new `bi` (build indexing) output group that contains only indexstores and their filelist, without linked product binaries. Index builds use this instead of `bp` to trigger transitive Swift compilation (producing swiftmodules, indexstores, and generated headers) while avoiding materialization of large linked binaries. This reduces remote cache download volume and build time for index builds, since only compilation artifacts needed for indexing are requested.
Signed-off-by: Ryo Aoyama <r.fe51028.r@gmail.com>
4a9fa8d to 6ab6c0f
|
Note: This fix addresses the local build path by eliminating linking during Index Build. For remote cache hits, unnecessary downloads are caused by a regex bug in |
brentleyjones left a comment
I think this might cause SwiftCompile and/or SwiftDeriveFiles to not trigger anymore since indexing is only one side of the action split. When using that mode .o is on the other half of .indexstore, and we need both (I believe). We definitely need to make sure that we are still generating all files needed by indexing.
Another reason I removed the extra output group was to save on starlark memory usage. There was lots of profiling to improve on memory usage and I don't really want to regress.
I'm not blocking this, just stating how this isn't an easy clear win.
|
Thanks for the input! I dug into both points. On the action split concern, I traced through On the memory side, totally fair concern. The actual new cost per target is one Happy to gate this behind a flag if you still think it's worth it, though the per-target overhead seems pretty minimal in practice. |
|
For the memory concern, each output group added a lot to the memory overhead since we add this to every focused target. Did you profile this with the starlark memory profiler? |
|
I haven't profiled it yet. I'll try measuring it with |
|
Yes, but also |
|
Profiled it with Memory (
CPU (
Both differences are within sampling noise. The "after" memory is actually slightly lower, and the CPU difference in output_files.bzl is ~1.3 ms across 130k targets. This makes sense to me since Does this address your concern, or is there something else you'd like me to look into? |
Not sure if it makes a difference, but the form I used before was: I am surprised there isn't a regression here. How large of a project are you testing it on? I recall large savings for each output group I eliminated when testing on a large project. |
|
The project I tested on is probably not as large as what you used before, but it's a monorepo and the xcodeproj target covers most of the project (some targets are excluded). The analysis configured 129,912 targets across 1,603 packages. I believe it's a fairly large project. For the profiling approach, I targeted the generator directly via the canonical label with the inner output base ( I think the reason there's no regression here is that |
|
It may seem small, but the creation and the existence of the new string for the output group, which has to be held in memory for every target, was part of the cost. |
|
Thanks for the opinion. I also considered modifying bp directly, but adding a separate bi group felt safer since it avoids changing existing output group behavior that other things might depend on. That said, if you'd prefer to keep the output group count down, I could gate bi behind a flag so it's only registered when needed. Would that work? |
|
I just wanted to make sure there weren't major regressions. If this still downloads all files needed for indexing (which |
|
Thanks! Will keep an eye out for any issues and happy to help fix anything that comes up. |
Summary
Introduce a new `bi` (build indexing) output group that contains only indexstores and their filelist, and switch Index Build to use `bi` instead of `bp`. This eliminates linking during Index Build, dramatically reducing local storage consumption and build time.

Problem
Index Build currently requests the `bp` (build products) output group, which includes linked product binaries (test bundles, app bundles). As noted in the existing comment:

The products are not used for indexing. They only serve as a trigger for transitive compilation. However, requesting linked products has significant side effects:

- Local builds: Bazel's action graph is demand-driven. Requesting linked products forces the linker to run for every target, producing large binaries. In a large project with many test targets, this results in hundreds of GB of unnecessary linked output.
- Remote cache hits: Even with `--remote_download_regex`, linked product binaries in the `bp` output group are subject to download via `--remote_download_outputs=toplevel` (configured in `xcodeproj.bazelrc`), further inflating local storage.

Solution
Add a new `bi` output group containing only:

- `transitive_indexstores`: indexstore directories that trigger Swift compilation
- `indexstores_filelist`: file list consumed by `import_indexstores`

Both are already collected in `output_files.bzl` but were not exposed as a separate output group.

Why `bi` correctly triggers all necessary compilation
Bazel's action graph is demand-driven: it executes only actions required to produce requested outputs. `SwiftCompile` is a single action that co-produces:

- `.indexstore` (requested via `bi`)
- `.swiftmodule`
- `.swiftdoc`
- `.swiftsourceinfo`
- `-Swift.h` (generated Objective-C header)
- `.o` (object file)

Since indexstores are a direct output of `SwiftCompile`, requesting them triggers the full compilation for every transitively depended Swift module, covering the same compilation scope as `bp`. The critical difference is that no linking action is triggered, because no linked product is in any requested output group.

The `--remote_download_regex` (already present in the script) then filters which compilation outputs to actually download from remote cache (`.swiftmodule`, `.swiftdoc`, headers, etc.), excluding unnecessary files like `.o`.

ObjC compatibility
ObjC module indexing is unaffected. In rules_xcodeproj, ObjC/C indexing is handled by Xcode's own clang compiler, and `-index-store-path` is explicitly stripped from CC flags passed to Bazel. ObjC compilation parameters are delivered through the `bc` output group, which remains unchanged.

Results
Tested in a large-scale production monorepo. Before the change, Index Build's `bazel-out` directory consumed over 1 TB of local storage due to linked product binaries that were unnecessary for indexing. After switching to `bi`, storage dropped to a few GB. No test bundles or app binaries are materialized, and Index Build completes significantly faster since linking is entirely skipped.

Considerations
This is a behavioral change to Index Build's output group selection. Testing confirms it works correctly for both Swift and ObjC modules in a large production codebase, but if a more conservative rollout is preferred, this could be gated behind an `xcodeproj` rule attribute (e.g., `index_build_output_group`) to allow projects to opt in. Happy to add that if desired.
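
Illustration
The demand-driven argument made under "Why bi correctly triggers all necessary compilation" can be sketched with a toy model in Python. This is not Bazel or rules_xcodeproj code: the `Action` type, the `actions_to_run` helper, and the three-action graph (`Lib`, `App`, and a link step) are hypothetical names made up for illustration. The sketch assumes each compile is a single action that co-produces its indexstore, swiftmodule, and object file, as the description states. If compilation is split into several actions, the producer of each file changes, but the same walk over the graph applies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """A build action; running it produces all of its outputs together."""
    name: str
    inputs: tuple
    outputs: tuple


def actions_to_run(actions, requested):
    """Return the sorted names of actions needed to produce `requested`."""
    producer = {out: action for action in actions for out in action.outputs}
    needed = set()
    stack = list(requested)
    while stack:
        path = stack.pop()
        action = producer.get(path)  # None: a source file, nothing to run
        if action is None or action.name in needed:
            continue
        needed.add(action.name)
        stack.extend(action.inputs)  # inputs must be produced first
    return sorted(needed)


# Hypothetical graph: App depends on Lib, and a link step builds the product.
ACTIONS = [
    Action("SwiftCompile(Lib)", ("Lib.swift",),
           ("Lib.indexstore", "Lib.swiftmodule", "Lib.o")),
    Action("SwiftCompile(App)", ("App.swift", "Lib.swiftmodule"),
           ("App.indexstore", "App.swiftmodule", "App.o")),
    Action("Link(App)", ("App.o", "Lib.o"), ("App.app/App",)),
]

BI = ["App.indexstore", "Lib.indexstore"]  # indexstores only (like `bi`)
BP = ["App.app/App"]                       # linked product (like `bp`)

print(actions_to_run(ACTIONS, BI))
# ['SwiftCompile(App)', 'SwiftCompile(Lib)']
print(actions_to_run(ACTIONS, BP))
# ['Link(App)', 'SwiftCompile(App)', 'SwiftCompile(Lib)']
```

In this model both requests reach the same compile actions, and only the `bp`-style request adds the link step. Requesting `App.indexstore` alone still pulls in the `Lib` compile, because `Lib.swiftmodule` is an input of the `App` compile.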