Add lightweight bi output group to skip linking during Index Build #3302

Merged
brentleyjones merged 1 commit into main from add-bi-output-group on Apr 10, 2026

@ra1028
Member

@ra1028 ra1028 commented Apr 10, 2026

Summary

Introduce a new bi (build indexing) output group that contains only indexstores and their filelist, and switch Index Build to use bi instead of bp. This eliminates linking during Index Build, dramatically reducing local storage consumption and build time.

Problem

Index Build currently requests the bp (build products) output group, which includes linked product binaries (test bundles, app bundles). As noted in the existing comment:

"Products (i.e. bundles) and index store data. The products themselves aren't used, they cause transitive files to be created."

The products are not used for indexing. They only serve as a trigger for transitive compilation. However, requesting linked products has significant side effects:

  1. Local builds: Bazel's action graph is demand-driven. Requesting linked products forces the linker to run for every target, producing large binaries. In a large project with many test targets, this results in hundreds of GB of unnecessary linked output.

  2. Remote cache hits: Even with --remote_download_regex, linked product binaries in the bp output group are subject to download via --remote_download_outputs=toplevel (configured in xcodeproj.bazelrc), further inflating local storage.

Solution

Add a new bi output group containing only:

  • transitive_indexstores: indexstore directories that trigger Swift compilation
  • indexstores_filelist: file list consumed by import_indexstores

Both are already collected in output_files.bzl but were not exposed as a separate output group.

Why bi correctly triggers all necessary compilation

Bazel's action graph is demand-driven: it executes only actions required to produce requested outputs. SwiftCompile is a single action that co-produces:

  • .indexstore (requested via bi)
  • .swiftmodule
  • .swiftdoc
  • .swiftsourceinfo
  • -Swift.h (generated Objective-C header)
  • .o (object file)

Since indexstores are a direct output of SwiftCompile, requesting them triggers full compilation of every Swift module in the transitive dependency closure, covering the same compilation scope as bp. The critical difference is that no linking action is triggered, because no linked product is in any requested output group.
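The demand-driven argument above can be sketched with a toy action graph. This is a minimal model, not Bazel's scheduler: the `Action` class, the module names (`App`, `Lib`), and the file names are all hypothetical, and only the `bp`/`bi` group names come from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    inputs: tuple
    outputs: tuple


# Hypothetical two-module project: App depends on Lib.
ACTIONS = [
    Action("SwiftCompile(Lib)", ("Lib.swift",),
           ("Lib.indexstore", "Lib.swiftmodule", "Lib.o")),
    Action("SwiftCompile(App)", ("App.swift", "Lib.swiftmodule"),
           ("App.indexstore", "App.swiftmodule", "App.o")),
    Action("Link(App)", ("App.o", "Lib.o"), ("App.app",)),
]

# Toy versions of the two output groups discussed in this PR.
OUTPUT_GROUPS = {
    "bp": ["App.app", "App.indexstore", "Lib.indexstore"],
    "bi": ["App.indexstore", "Lib.indexstore"],
}


def actions_for(requested):
    """Return the names of all actions needed to produce the requested files."""
    producer = {out: action for action in ACTIONS for out in action.outputs}
    ran = set()
    pending = list(requested)
    while pending:
        action = producer.get(pending.pop())
        if action is None or action.name in ran:
            continue  # source file, or action already scheduled
        ran.add(action.name)
        pending.extend(action.inputs)
    return ran


ran_bp = actions_for(OUTPUT_GROUPS["bp"])
ran_bi = actions_for(OUTPUT_GROUPS["bi"])
print(sorted(ran_bp))
print(sorted(ran_bi))
```

In this model both groups schedule the same compile actions, and the only action that `bp` adds over `bi` is the link step.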

The --remote_download_regex (already present in the script) then filters which compilation outputs to actually download from remote cache (.swiftmodule, .swiftdoc, headers, etc.), excluding unnecessary files like .o.
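As a rough illustration of that filtering step, here is a sketch that applies a keep-pattern to a list of output paths. The pattern and the paths are hypothetical and are not the ones used by the rules_xcodeproj script; Python's `re` module stands in for Bazel's regex engine, so matching details may differ.

```python
import re

# Hypothetical keep-pattern, NOT the regex from the actual Index Build script.
KEEP = re.compile(r".*\.(swiftmodule|swiftdoc|swiftsourceinfo|h)$")

# Hypothetical output paths for illustration only.
outputs = [
    "bazel-out/cfg/bin/Lib/Lib.swiftmodule",
    "bazel-out/cfg/bin/Lib/Lib.swiftdoc",
    "bazel-out/cfg/bin/Lib/Lib-Swift.h",
    "bazel-out/cfg/bin/Lib/Lib.o",
    "bazel-out/cfg/bin/App/App.app/App",
]


def files_to_download(paths):
    """Keep only the paths whose full text matches the keep-pattern."""
    return [p for p in paths if KEEP.match(p)]


kept = files_to_download(outputs)
print(kept)
```

With this pattern the object file and the linked binary are left out, while module interfaces and headers are kept.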

ObjC compatibility

ObjC module indexing is unaffected. In rules_xcodeproj, ObjC/C indexing is handled by Xcode's own clang compiler, and -index-store-path is explicitly stripped from CC flags passed to Bazel. ObjC compilation parameters are delivered through the bc output group, which remains unchanged.
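To make the flag stripping concrete, here is a small sketch of removing `-index-store-path` and its argument from a flag list. The function name is hypothetical and this is not the rules_xcodeproj implementation; it assumes the flag is passed as two separate tokens.

```python
def strip_index_store_path(flags):
    """Drop each `-index-store-path <dir>` pair from a compiler flag list."""
    cleaned = []
    skip_next = False
    for flag in flags:
        if skip_next:
            skip_next = False  # this token is the path argument; drop it
            continue
        if flag == "-index-store-path":
            skip_next = True
            continue
        cleaned.append(flag)
    return cleaned


example = ["-c", "Foo.m", "-index-store-path", "/tmp/index", "-O0"]
print(strip_index_store_path(example))
```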

Results

Tested in a large-scale production monorepo. Before the change, Index Build's bazel-out directory consumed over 1TB of local storage due to linked product binaries that were unnecessary for indexing. After switching to bi, storage dropped to a few GB. No test bundles or app binaries are materialized, and Index Build completes significantly faster since linking is entirely skipped.

Considerations

This is a behavioral change to Index Build's output group selection. Testing confirms it works correctly for both Swift and ObjC modules in a large production codebase, but if a more conservative rollout is preferred, this could be gated behind an xcodeproj rule attribute (e.g., index_build_output_group) to allow projects to opt in. Happy to add that if desired.

@ra1028 ra1028 requested a review from a team as a code owner April 10, 2026 09:39
Introduce a new `bi` (build indexing) output group that contains only
indexstores and their filelist, without linked product binaries. Index
builds use this instead of `bp` to trigger transitive Swift compilation
(producing swiftmodules, indexstores, and generated headers) while
avoiding materialization of large linked binaries.

This reduces remote cache download volume and build time for index
builds, since only compilation artifacts needed for indexing are
requested.

Signed-off-by: Ryo Aoyama <r.fe51028.r@gmail.com>
@ra1028
Member Author

ra1028 commented Apr 10, 2026

Note: This fix addresses the local build path by eliminating linking during Index Build. For remote cache hits, unnecessary downloads are caused by a regex bug in --remote_download_regex. See #3301 for a complementary fix that corrects the regex.

Contributor

@brentleyjones brentleyjones left a comment

I think this might cause SwiftCompile and/or SwiftDeriveFiles to not trigger anymore since indexing is only one side of the action split. When using that mode .o is on the other half of .indexstore, and we need both (I believe). We definitely need to make sure that we are still generating all files needed by indexing.

Another reason I removed the extra output group was to save on starlark memory usage. There was lots of profiling to improve on memory usage and I don't really want to regress.

I'm not blocking this, just stating how this isn't an easy clear win.

@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

Thanks for the input!

I dug into both points.

On the action split concern, I traced through compiling.bzl in rules_swift and found that .indexstore and .o are actually both outputs of SwiftCompile when split is enabled. SwiftDeriveFiles gets .swiftmodule, -Swift.h, .swiftdoc, etc. So requesting .indexstore via bi still triggers SwiftCompile which produces .o as well. Upstream deps' .swiftmodule files are also produced because SwiftCompile requires transitive .swiftmodule as inputs (_dependencies_swiftmodules_configurator), which forces SwiftDeriveFiles to run transitively. This matches the behavior when bp is requested since bp doesn't directly request .swiftmodule either. Also, split mode isn't enabled by default and rules_xcodeproj doesn't enable it, so this shouldn't be a practical concern either way.
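The split-mode reasoning can be checked against a toy model. The sketch below assumes the output split described in this comment (SwiftCompile produces `.indexstore` and `.o`; SwiftDeriveFiles produces `.swiftmodule`, `-Swift.h`, `.swiftdoc`) and assumes both actions take the direct dependencies' `.swiftmodule` files as inputs. The module names are hypothetical and the model is not rules_swift's actual action graph.

```python
def build_actions(deps_by_module):
    """Map action name -> (inputs, outputs) for a split-compile toy model."""
    actions = {}
    for module, deps in deps_by_module.items():
        dep_modules = [f"{d}.swiftmodule" for d in deps]
        actions[f"SwiftCompile({module})"] = (
            dep_modules,
            [f"{module}.indexstore", f"{module}.o"],
        )
        actions[f"SwiftDeriveFiles({module})"] = (
            dep_modules,
            [f"{module}.swiftmodule", f"{module}-Swift.h", f"{module}.swiftdoc"],
        )
    return actions


def demanded(requested, actions):
    """Return the names of the actions needed to produce the requested files."""
    producer = {out: name for name, (_, outs) in actions.items() for out in outs}
    ran = set()
    pending = list(requested)
    while pending:
        name = producer.get(pending.pop())
        if name is not None and name not in ran:
            ran.add(name)
            pending.extend(actions[name][0])
    return ran


# Hypothetical chain: App -> Feature -> Core.
actions = build_actions({"Core": [], "Feature": ["Core"], "App": ["Feature"]})
bi_request = ["App.indexstore", "Feature.indexstore", "Core.indexstore"]
ran = demanded(bi_request, actions)
print(sorted(ran))
```

In this model every module's SwiftCompile runs, and SwiftDeriveFiles runs for each module that something downstream imports. Nothing in the model demands SwiftDeriveFiles for the top-level module, since no requested action consumes its `.swiftmodule`; whether that matters in practice is outside what the sketch can show.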

On the memory side, totally fair concern. The actual new cost per target is one memory_efficient_depset wrapper referencing the already-computed transitive_indexstores, and one extra tuple in direct_group_list. transitive_indexstores is already computed and stored in the provider unconditionally, so bi doesn't add new depset computation. BEP-wise, tree artifact expansion only happens when bi is actually requested during Index Build, not during normal builds where depsets stay lazy in OutputGroupInfo.
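The "wrapper plus one tuple" cost can be pictured with a toy depset. The `Depset` class below is a hypothetical Python stand-in, not Starlark's depset and not rules_xcodeproj's `memory_efficient_depset`; it only illustrates that wrapping an existing depset stores a reference and defers flattening until the group is requested.

```python
class Depset:
    """Toy depset: holds references to children and flattens only on demand."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)
        self.transitive = tuple(transitive)

    def to_list(self):
        out, seen = [], set()

        def visit(node):
            for child in node.transitive:
                visit(child)
            for item in node.direct:
                if item not in seen:
                    seen.add(item)
                    out.append(item)

        visit(self)
        return out


# Already computed for the provider, independent of any `bi` group.
core = Depset(["Core.indexstore"])
transitive_indexstores = Depset(["App.indexstore"], [core])

# The new group: one wrapper object and one (name, depset) tuple.
bi = Depset(transitive=[transitive_indexstores])
direct_group_list = [("bi", bi)]

# Flattening happens only when the group is actually requested.
print(bi.to_list())
```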

Happy to gate this behind a flag if you still think it's worth it, though the per-target overhead seems pretty minimal in practice.

@brentleyjones
Contributor

For the memory concern, each output group added a lot to the memory overhead since we add this to every focused target. Did you profile this with the starlark memory profiler?

@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

I haven't profiled it yet. I'll try measuring it with bazel dump --skylark_memory and share the results. Is that the way you had in mind?

@brentleyjones
Contributor

Yes, but also --starlark_cpu_profile since there was a CPU overhead for the output groups as well.

@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

Profiled it with --skylark_memory (using the allocation instrumenter) and --starlark_cpu_profile. I ran the analysis on the actual xcodeproj generator target (@@rules_xcodeproj++internal+rules_xcodeproj_generated//generator/xcodeproj:xcodeproj) with --nobuild, which configured 129,912 targets. Ran it twice: once with the bi output group patch and once without.

Memory (bazel dump --skylark_memory):

  • Before: 287.5 MB total, output_files.bzl: 19.8 MB (6.9%)
  • After: 282.8 MB total, output_files.bzl: 16.5 MB (5.8%)

CPU (--starlark_cpu_profile):

  • Before: 1.204s total, output_files.bzl: 9.2 ms (0.76%)
  • After: 1.226s total, output_files.bzl: 10.5 ms (0.86%)

Both differences are within sampling noise. The "after" memory is actually slightly lower, and the CPU difference in output_files.bzl is ~1.3 ms across 130k targets.
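For reference, the deltas and percentages implied by the figures above can be recomputed directly. The numbers are the ones reported in this comment; the variable names are just for illustration.

```python
targets = 129_912

# Memory in MB, CPU in ms, as reported above.
mem_before, mem_after = 287.5, 282.8
of_mem_before, of_mem_after = 19.8, 16.5
cpu_before, cpu_after = 1204.0, 1226.0
of_cpu_before, of_cpu_after = 9.2, 10.5


def pct(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


mem_delta = mem_after - mem_before            # total memory change, MB
cpu_delta = cpu_after - cpu_before            # total CPU change, ms
of_cpu_delta = of_cpu_after - of_cpu_before   # output_files.bzl CPU change, ms
per_target_ns = of_cpu_delta * 1e6 / targets  # CPU change per configured target

print(f"memory: {mem_delta:+.1f} MB ({pct(mem_delta, mem_before):+.1f}%)")
print(f"cpu: {cpu_delta:+.0f} ms ({pct(cpu_delta, cpu_before):+.1f}%)")
print(f"output_files.bzl cpu: {of_cpu_delta:+.1f} ms, about {per_target_ns:.0f} ns per target")
```

The output_files.bzl CPU difference works out to roughly 10 ns per configured target.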

This makes sense to me since transitive_indexstores is already computed unconditionally. The bi group only adds one memory_efficient_depset wrapper and one tuple entry per target, which is the same marginal cost as any other existing output group entry (like bc or bl).

Does this address your concern, or is there something else you'd like me to look into?

@brentleyjones
Contributor

"I ran the analysis on the actual xcodeproj generator target (@@rules_xcodeproj++internal+rules_xcodeproj_generated//generator/xcodeproj:xcodeproj) with --nobuild"

Not sure if it makes a difference, but the form I used before was:

bazel run //:xcodeproj -- 'shutdown'
bazel run //:xcodeproj -- 'build --nobuild $_GENERATOR_LABEL_ --starlark_cpu_profile=/tmp/cpu.pprof.gz'

I am surprised there isn't a regression here. How large a project are you testing it on? I recall large savings for each output group I eliminated when testing on a large project.

@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

The project I tested on is probably not as large as what you used before, but it's a monorepo and the xcodeproj target covers most of the project (some targets are excluded). The analysis configured 129,912 targets across 1,603 packages. I believe it's a fairly large project.

For the profiling approach, I targeted the generator directly via the canonical label with the inner output base (--output_base=.../rules_xcodeproj.noindex/build_output_base), using --config=rules_xcodeproj_generator. Should be equivalent to going through the runner, but I can re-run it your way if you think the runner's environment setup could affect the results.

I think the reason there's no regression here is that bi doesn't compute any new data. transitive_indexstores is already computed unconditionally and stored in the provider regardless of whether bi exists. The bi group just wraps it in one memory_efficient_depset call and adds one tuple to the output group list. If the output groups you eliminated before were computing their own depsets with new transitive data, that would explain the larger savings you saw.

@brentleyjones
Contributor

It may seem small, but the creation and the existence of the new string for the output group, which has to be held in memory for every target, was part of the cost.

@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

Thanks for the opinion.
I hear you on the per-target string cost. That said, the problem this solves is pretty significant. Index Build unintentionally triggers linking across all transitive targets, producing linked products that can reach over a few TB of local storage (we do have a way to limit targets included in the xcodeproj, but regenerating for target switches adds development overhead, and there are common cases where many targets need to be included). Given the scale of the problem, the marginal cost of one additional output group entry (same as what bc and bl already pay) seems like a reasonable trade-off.

I also considered modifying bp directly, but adding a separate bi group felt safer since it avoids changing existing output group behavior that other things might depend on.

That said, if you'd prefer to keep the output group count down, I could gate bi behind a flag so it's only registered when needed. Would that work?

@brentleyjones
Contributor

I just wanted to make sure there weren't major regressions. If this still downloads all files needed for indexing (which bp tried to ensure "bluntly" before), then this should be good. Let's land it now and fix any issues that pop up.

@brentleyjones brentleyjones merged commit 75da5b5 into main Apr 10, 2026
9 of 11 checks passed
@brentleyjones brentleyjones deleted the add-bi-output-group branch April 10, 2026 17:46
@ra1028
Member Author

ra1028 commented Apr 10, 2026

@brentleyjones

Thanks! Will keep an eye out for any issues and happy to help fix anything that comes up.
