ci(mrbind): cache thirdparty/mrbind/build across all workflows via reusable action #6087
The Build MRBind step compiles a small CMake project (45 Ninja targets) that depends only on the pinned `thirdparty/mrbind` submodule and the MSYS2 clang-18 toolchain. Both are stable across most CI runs, but the build is re-done from scratch every time (~3 min wall-clock).
Cache `thirdparty/mrbind/build` with a key derived from the submodule SHA, the MSYS2 lockfile, and the install script. On cache hit, skip the Build MRBind step entirely; the existing mrbind.exe / mrbind_gen_c.exe / mrbind_gen_csharp.exe are consumed by the subsequent "Generate C bindings" / "Generate C# bindings" steps as before.
Same pattern as the existing msys2_packages cache in .github/actions/install-msys2-mrbind.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the cache-on-submodule-SHA pattern from the previous commit to
every workflow that builds MRBind:
- build-test-windows.yml (already done; refactored to call the
action)
- build-test-ubuntu-x64.yml
- build-test-ubuntu-arm64.yml
- build-test-macos.yml (split deps install from the build)
- build-test-linux-vcpkg.yml
- pip-build.yml (3 sections: rocky, windows, macos)
- update-docs-manual.yml
The composite action at .github/actions/build-mrbind:
- computes the `thirdparty/mrbind` submodule SHA,
- caches `thirdparty/mrbind/build` with a key derived from the
caller-supplied prefix (platform/toolchain), the submodule SHA, the
hash of the build script, and a caller-supplied extra key,
- runs the build script only on cache miss.
The cache-key-prefix includes the Docker image tag for
container-based jobs (ubuntu, rocky), so the cache invalidates when the
toolchain image changes. On macOS / Windows we use the clang version
file or the MSYS2 lockfile as the toolchain identifier.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
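To make the key composition described in the commit above concrete, here is a minimal shell sketch. It only approximates the real action: `hashFiles()` in GitHub Actions computes its own digest, so the script hash below will not match the one Actions produces. The helper names `sha256_of_file` and `mrbind_cache_key`, and the argument order, are illustrative assumptions and do not come from the repository.

```bash
# Hypothetical helper: print the SHA-256 of a file with whichever tool exists.
sha256_of_file() (
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
)

# mrbind_cache_key PREFIX SUBMODULE_SHA BUILD_SCRIPT EXTRA
# Joins the four components in the order the commit message lists them:
# caller prefix, submodule SHA, build-script hash, caller-supplied extra key.
mrbind_cache_key() (
  printf 'mrbind-build-%s-%s-%s-%s\n' "$1" "$2" "$(sha256_of_file "$3")" "$4"
)
```

A caller would pass the output of `git -C thirdparty/mrbind rev-parse HEAD` as the second argument. Editing the build script or moving the submodule to a new commit changes the key, which is what forces a rebuild on the next run.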
The arm64 matrix in build-test-macos.yml has two ARM runners with different homebrew prefixes:
matrix.os=github-arm -> macos-latest -> /opt/homebrew
matrix.os=arm -> self-hosted ARM -> /Users/runner/.homebrew
`runner.arch` is ARM64 on both, so they were sharing a cache entry. The cached mrbind binary embeds the homebrew prefix in its rpath, so restoring a binary built on one runner onto the other fails with `dyld: Library not loaded: /Users/runner/.homebrew/opt/llvm@18/lib/libLLVM.dylib`.
Use matrix.os instead. In pip-build.yml's macOS section, use matrix.instance for the same reason.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both build-test-linux-vcpkg.yml and pip-build.yml's rocky section build mrbind inside the same `meshlib/meshlib-rockylinux8-vcpkg-<arch>:<tag>` Docker image using the same `install_mrbind_rockylinux.sh` script. The resulting binaries are identical, but they were using distinct cache key prefixes (`rocky-...` vs `rocky-pip-...`), so each workflow built its own copy.
Unify both prefixes to `rockylinux8-vcpkg-<arch>-<tag>` -- matching the Docker image name exactly -- so the two callers share one cache entry.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… pip
Two changes that together let build-test-macos.yml and pip-build.yml
share a single mrbind cache entry per runner image:
1. Pin the arm64 Debug job's runner from `macos-latest` to `macos-14`
so it matches pip-build.yml's pinned `macos-14` runner. The
`latest` label is rolling; if it advances to macos-15-arm while
pip-build still pins macos-14, a binary cached under one would be
restored on the other (and may break, as happened earlier with the
self-hosted runner's homebrew prefix).
2. Add `instance:` to each matrix entry in build-test-macos.yml,
matching the field name pip-build.yml already uses, and key the
cache as `${{ matrix.instance }}-clang` in both. With the pin
above, the two workflows now resolve to identical keys for the two
GitHub-hosted images:
macos-15-intel-clang-<...> (build-test x64 + pip x86)
macos-14-clang-<...> (build-test arm64 Debug + pip arm64)
self-hosted-arm-clang-<...> (build-test arm64 Release, alone)
Saves one mrbind build per CI run and one cache entry (~65 MiB).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior commit pinned this job to `macos-14` based on a stale memory that `macos-latest` resolved to `macos-14`. It doesn't -- the migration to `macos-15` finished 2025-09-01 (see GitHub Changelog link in the inline comment), so the original `macos-latest` job was actually running on `macos-15`. Pinning to `macos-14` was a silent downgrade.
Pin to `macos-15` instead -- what `macos-latest` resolves to today -- so the runner is explicit and won't roll forward without a code change.
This also unpairs the cache from pip-build.yml's arm64, which pins `macos-14` for wheel-compat. Different macOS SDKs shouldn't share the mrbind cache anyway; cross-pollination from a macos-15-built binary to a macos-14 wheel-build job is exactly the kind of subtle breakage we just fixed with the homebrew-prefix discriminator.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…est target"
This reverts 80e980b. Sharing the mrbind cache with pip-build.yml's arm64 (which pins macos-14) is worth more than tracking macos-latest's current resolution. Drop the comment that misstated which version `macos-latest` resolves to.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Downstream steps only invoke the three executables -- mrbind,
mrbind_gen_c, mrbind_gen_csharp (see scripts/mrbind/generate.mk:191-193).
The rest of `thirdparty/mrbind/build` (object files, static libs,
CMake/Ninja metadata) is dead weight in the cache because the install
script wipes `build/` before any rebuild, so we never use incremental
state from a restored cache anyway.
Add a step that runs after Build MRBind (cache miss only, before the
post-step cache save):
1. `find ... -exec rm -rf` everything except the three executables.
2. `strip` debug info from those executables. The CMakeLists builds
with RelWithDebInfo + -fstandalone-debug; we don't need that
payload for the cached CI binaries, and a developer hitting a
real mrbind crash can rebuild locally to get full debug info.
Per-platform strip:
- macOS: `strip -S` (drop debug symbols, keep dynamic symbol table).
- Windows: invoke MSYS2 clang64's strip directly
(`${MSYS2_DIR}/clang64/bin/strip.exe`) since Git Bash's PATH doesn't include it.
- Linux: plain `strip`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
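The trim step described in the commit above can be sketched roughly as follows. This is an approximation under stated assumptions, not the action's actual script: it assumes the three executables sit directly under the build directory (as the strip paths elsewhere in this PR suggest), the `-mindepth`/`-maxdepth` flags are a choice made for this sketch, and the function name `trim_mrbind_build` is hypothetical.

```bash
# trim_mrbind_build DIR
# Delete every entry directly under DIR except the three MRBind executables
# (with or without the Windows .exe suffix). Directories such as CMakeFiles/
# are removed recursively by rm -rf.
trim_mrbind_build() (
  find "$1" -mindepth 1 -maxdepth 1 \
    ! -name 'mrbind' ! -name 'mrbind.exe' \
    ! -name 'mrbind_gen_c' ! -name 'mrbind_gen_c.exe' \
    ! -name 'mrbind_gen_csharp' ! -name 'mrbind_gen_csharp.exe' \
    -exec rm -rf {} +
)
```

Limiting the search to depth 1 means `find` never tries to descend into a directory that `rm -rf` has already removed, and the executables are matched by exact name rather than by pattern.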
    Windows)
      # The build uses MSYS2 clang64; reach for that strip directly since
      # Git Bash's PATH doesn't include it.
      STRIP="${MSYS2_DIR:-C:/msys64}/clang64/bin/strip.exe"
Seems like nothing can set MSYS2_DIR at this point?
Good catch — env vars from the previous step aren't propagated to subsequent composite-action steps, so ${MSYS2_DIR:-...} always fell through to the default. Hard-coded C:/msys64/clang64/bin/strip.exe in 6974556dc.
Three changes in response to #6087 review:
1. Clarify the comment on the `MSYS2_DIR: C:\msys64` env override -- the bat script defaults to `C:\msys64_meshlib_mrbind` (a separate copy for local dev), so the override is needed to point at where `install-msys2-mrbind` actually installs MSYS2 on CI runners.
2. Drop the `${MSYS2_DIR:-...}` indirection in the Windows strip path. Env vars from the Build MRBind step aren't propagated to subsequent composite-action steps, so the expression always fell through to the default; hard-code `C:/msys64/clang64/bin/strip.exe` instead.
3. Split the trim and strip work into two separate steps and gate the strip step on a new `strip-binaries` input (default `'true'`). A debugging branch can pass `strip-binaries: 'false'` to retain useful stack traces from mrbind crashes without editing the action.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
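The fall-through behaviour behind change 2 is ordinary shell parameter expansion and can be reproduced outside CI. The sketch below is illustrative only; the function name `strip_path` is hypothetical, and the expression is copied from the reviewed snippet.

```bash
# strip_path: evaluate the expression the original strip step used.
# ${VAR:-default} yields the default whenever VAR is unset or empty.
strip_path() (
  printf '%s\n' "${MSYS2_DIR:-C:/msys64}/clang64/bin/strip.exe"
)

# A step-level `env:` entry is visible only inside the step that declares it,
# so in the later strip step the variable is unset and the default wins:
unset MSYS2_DIR
strip_path    # prints C:/msys64/clang64/bin/strip.exe

# Only when the variable is set in the same shell does it take effect:
( MSYS2_DIR='C:/msys64_meshlib_mrbind'; strip_path )
# prints C:/msys64_meshlib_mrbind/clang64/bin/strip.exe
```

Since the expansion could never see a value in that step, hard-coding the path changes nothing at runtime and removes a misleading knob.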
    uses: ./.github/actions/build-mrbind
    with:
      # Include the Docker image tag (e.g. glibc, clang install) and clang version file in the cache key.
      cache-key-prefix: ubuntu-${{ matrix.os }}-${{ inputs.docker_image_tag }}
    path: thirdparty/mrbind/build
    key: mrbind-build-${{ inputs.cache-key-prefix }}-${{ steps.mrbind-sha.outputs.sha }}-${{ hashFiles(inputs.build-script) }}-${{ inputs.extra-cache-key }}

    - name: Build MRBind
Maybe better to implement two different jobs for Windows and for Unix with if: ${{ runner.os == 'Windows' }}.
    ! -name 'mrbind_gen_csharp' ! -name 'mrbind_gen_csharp.exe' \
    -exec rm -rf {} +

    - name: Strip debug info from MRBind binaries before cache save
Create different jobs for each platform based on runner.os: https://docs.github.com/en/actions/reference/workflows-and-actions/contexts#runner-context
Address review on #6087:
1. build-test-ubuntu-x64.yml: rename the cache-key-prefix from `ubuntu-<os>-<tag>` to `ubuntu-x64-<os>-<tag>` to match the arm64 workflow's `ubuntu-arm64-<os>-<tag>` shape -- the arch should be explicit in the key.
2. build-mrbind/action.yml: split the single "Build MRBind" step into "Build MRBind (Windows)" and "Build MRBind (Unix)". Only the Windows variant carries the `MSYS2_DIR` env override; previously the env was set on every platform and just happened to be ignored on non-Windows shells.
3. build-mrbind/action.yml: split the `case "$RUNNER_OS"` strip dispatch into three platform-specific steps gated on `runner.os == 'macOS' | 'Linux' | 'Windows'`. Same behaviour as before, idiomatic GHA conditional steps instead of bash branching.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
    # `C:\msys64_meshlib_mrbind` (a separate copy intended for local dev) to
    # match where `install-msys2-mrbind` puts MSYS2 on the CI runner.
    if: ${{ steps.cache.outputs.cache-hit != 'true' && runner.os == 'Windows' }}
    shell: ${{ inputs.shell }}
    env:
      MSYS2_DIR: C:\msys64
    run: ${{ inputs.build-command != '' && inputs.build-command || inputs.build-script }}
run: call ${{ inputs.build-script }}
    - name: Build MRBind (Unix)
      if: ${{ steps.cache.outputs.cache-hit != 'true' && runner.os != 'Windows' }}
      shell: ${{ inputs.shell }}
      run: ${{ inputs.build-command != '' && inputs.build-command || inputs.build-script }}
run: ${{ inputs.build-script }}
    # -S drops debug symbols but keeps the dynamic symbol table.
    if: ${{ steps.cache.outputs.cache-hit != 'true' && inputs.strip-binaries == 'true' && runner.os == 'macOS' }}
    shell: bash
    run: strip -S thirdparty/mrbind/build/mrbind thirdparty/mrbind/build/mrbind_gen_c thirdparty/mrbind/build/mrbind_gen_csharp
Maybe better to write it as `strip -S thirdparty/mrbind/build/{mrbind,mrbind_gen_c,mrbind_gen_csharp}`
    - name: Strip MRBind binaries (Linux)
      if: ${{ steps.cache.outputs.cache-hit != 'true' && inputs.strip-binaries == 'true' && runner.os == 'Linux' }}
      shell: bash
      run: strip thirdparty/mrbind/build/mrbind thirdparty/mrbind/build/mrbind_gen_c thirdparty/mrbind/build/mrbind_gen_csharp
Maybe better to write it as `strip thirdparty/mrbind/build/{mrbind,mrbind_gen_c,mrbind_gen_csharp}`
Per #6087 review: The shell and invocation pattern are determined by the platform, not the caller. Hard-code them in the action:
- Windows step: `shell: cmd`, `run: call ${{ inputs.build-script }}`
- Unix step: `shell: bash`, `run: ${{ inputs.build-script }}`
Drop the `shell` and `build-command` inputs entirely. Update the two Windows callers (build-test-windows.yml, pip-build.yml) -- they were the only ones passing those inputs; every other caller already used defaults.
Also use bash brace expansion in the strip steps -- shorter and clearer than spelling out three explicit paths.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add two optional inputs to build-mrbind so the action can be consumed from a repo where MeshLib is itself a submodule (MeshInspectorCode #7306) or from a runner where MSYS2 isn't at `C:\msys64`:
- `mrbind-dir`: path to the mrbind submodule. Default `thirdparty/mrbind` (MeshLib's own layout). Downstream consumers pass e.g. `MeshLib/thirdparty/mrbind`.
- `msys2-dir`: Windows-only MSYS2 location. Default `C:\msys64` (where install-msys2-mrbind puts it on CI). Downstream consumers pass `C:\msys64_meshlib_mrbind` for runners pre-baked with the bat script's default layout.
All existing MeshLib callers continue to use the defaults; no caller changes needed.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `[self-hosted, macos, arm64, build]` label set matches multiple
physical runners with heterogeneous homebrew prefixes -- some have
homebrew at `/Users/runner/.homebrew`, others at `/opt/homebrew`. The
prefix is baked into `mrbind`'s rpath, so a binary built on one runner
fails with `dyld: Library not loaded` when restored on another.
Hash the brew-prefix and append it to the cache key on both macOS
workflows:
- build-test-macos.yml: hash `steps.setup.outputs.brew-prefix`
(computed at runtime from `brew --prefix`).
- pip-build.yml (macOS section): hash `matrix.brewpath` (pinned in
the matrix; runners are GitHub-hosted with stable prefixes).
Both workflows use the same `printf | shasum -a 256 | cut -c1-16`
recipe so the keys still align for runners that actually have the same
prefix (e.g. build-test arm64 Debug on macos-14 + pip-build arm64 on
macos-14 both hash `/opt/homebrew` to the same value and share a
cache entry).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
New composite action `.github/actions/build-mrbind` that caches `thirdparty/mrbind/build` keyed on the submodule SHA + toolchain + build script, and skips the build entirely on cache hit. Wired into every workflow that builds MRBind:
- `build-test-windows.yml`
- `build-test-ubuntu-x64.yml`
- `build-test-ubuntu-arm64.yml`
- `build-test-macos.yml`
- `build-test-linux-vcpkg.yml`
- `pip-build.yml` (rocky, windows, macos sections)
- `update-docs-manual.yml`
Why
In a typical Windows run (e.g. job 75535849635), Build MRBind takes ~3 min: `[1/45]` at `09:20:11` → `[45/45]` at `09:23:18`. Ninja already builds in parallel; the wall-clock is bounded by a handful of heavy translation units (notably `mrbind_gen_csharp/generator.cpp` is a 36 s straggler that gates the final link). Adding more `-j` on a 4-core runner doesn't help. But the `thirdparty/mrbind` submodule is pinned and the toolchain is pinned per platform, so re-doing the build every run is pure waste.
How the action works
On invocation:
1. Compute the `thirdparty/mrbind` submodule SHA via `git -C thirdparty/mrbind rev-parse HEAD`.
2. Restore `actions/cache@v5` over `thirdparty/mrbind/build`, key = `mrbind-build-<prefix>-<sha>-<scriptHash>-<extra>`.
3. On cache hit, skip the build script (it wipes `build/` at the start, so we just don't run it).
4. On cache miss, run the build script, then trim the directory to the three executables (`mrbind`, `mrbind_gen_c`, `mrbind_gen_csharp`, the only things `generate.mk` references) and `strip` their debug info. The trimmed/stripped dir is what gets cached.
Per-platform cache key strategy
| Workflow(s) | `cache-key-prefix` | Toolchain identifier (`extra-cache-key`) |
| --- | --- | --- |
| `build-test-windows.yml`, `pip-build.yml` (win) | `windows-clang18` | `msys2_package_hashes_clang18.txt` |
| `build-test-ubuntu-x64.yml` | `ubuntu-<os>-<docker_image_tag>` | `clang_version.txt` |
| `build-test-ubuntu-arm64.yml` | `ubuntu-arm64-<os>-<docker_image_tag>` | `clang_version.txt` |
| `build-test-linux-vcpkg.yml`, `pip-build.yml` (rocky) | `rockylinux8-vcpkg-<arch>-<tag>` | |
| `build-test-macos.yml`, `pip-build.yml` (mac) | `<matrix.instance>-clang` | `clang_version.txt` |
| `update-docs-manual.yml` | `ubuntu-docs-clang` | `clang_version.txt` |
Notes on discrimination:
- macOS keys on `matrix.instance` (the runner image name), not `runner.arch`. The two arm64 runner variants, `macos-14` (GitHub-hosted, `/opt/homebrew`) and `[self-hosted, macos, arm64, build]` (`/Users/runner/.homebrew`), share `runner.arch=ARM64` but have different homebrew prefixes baked into the `mrbind` binary's rpath, so they must not share a cache (this caused a `dyld: Library not loaded` failure when first wired up).
- `macos-latest` was replaced with an explicit `macos-14` for the arm64 Debug job, matching `pip-build.yml`'s pin, so the two workflows share one cache entry.
Cache sharing (intentional)
Where two workflows produce byte-identical binaries (same Docker image, same script, same submodule), the keys are unified so they share a single cache entry:
- `rockylinux8-vcpkg-x64-latest` ↔ build-test-linux-vcpkg x64 + pip-build x86_64
- `rockylinux8-vcpkg-arm64-latest` ↔ build-test-linux-vcpkg arm64 + pip-build aarch64
- `macos-15-intel-clang` ↔ build-test-macos x64 + pip-build x86
- `macos-14-clang` ↔ build-test-macos arm64 Debug + pip-build arm64
Trimming + stripping
`thirdparty/mrbind/build` post-build contains 3 executables, 3 intermediate static libs, 32 `.o[bj]` files, full CMakeFiles tree, and Ninja metadata. Only the executables are consumed downstream (`generate.mk:191-193`). After Build MRBind succeeds, the action `find`-deletes everything except the 3 binaries and `strip`s them (`strip -S` on macOS, MSYS2 `strip.exe` on Windows, plain `strip` elsewhere). Devs needing full debug info can rebuild locally; the cached binaries are for downstream consumption only.
Results
Wall-clock per matrix entry (master vs PR on cache hit):
Cache footprint: 102 MiB across 10 entries (peaked at ~1369 MiB / 18 entries during PR iteration before unification and trimming).
- `windows-clang18`
- `ubuntu-{ubuntu22,ubuntu24}-latest`
- `ubuntu-arm64-{ubuntu22,ubuntu24}-latest`
- `rockylinux8-vcpkg-{x64,arm64}-latest`
- `macos-15-intel-clang`
- `macos-14-clang`
- `self-hosted-arm-clang`
(Rocky entries are an order of magnitude smaller because that toolchain exposes `clang-cpp` + `LLVM` as a single shared lib, so `mrbind` ends up dynamically linked, vs Ubuntu's `clangTooling`-based static linking. Different binary shape, same source.)
Test plan
- A change to the `thirdparty/mrbind` submodule SHA causes cache miss → full rebuild.
- `update-docs-manual.yml` is workflow-dispatch only, so it is not exercised by CI on this PR.