
ci(mrbind): cache thirdparty/mrbind/build across all workflows via reusable action #6087

Merged: Fedr merged 16 commits into master from ci/cache-mrbind-build on May 13, 2026

Fedr commented May 12, 2026

Summary

New composite action .github/actions/build-mrbind that caches thirdparty/mrbind/build keyed on the submodule SHA + toolchain + build script, and skips the build entirely on cache hit. Wired into every workflow that builds MRBind:

  • build-test-windows.yml
  • build-test-ubuntu-x64.yml
  • build-test-ubuntu-arm64.yml
  • build-test-macos.yml
  • build-test-linux-vcpkg.yml
  • pip-build.yml (rocky, windows, macos sections)
  • update-docs-manual.yml

Why

In a typical Windows run (e.g. job 75535849635), Build MRBind takes ~3 min: [1/45] at 09:20:11 → [45/45] at 09:23:18. Ninja already builds in parallel; the wall-clock time is bounded by a handful of heavy translation units (notably mrbind_gen_csharp/generator.cpp, a 36 s straggler that gates the final link), so adding more -j on a 4-core runner doesn't help. But the thirdparty/mrbind submodule is pinned and the toolchain is pinned per platform, so redoing the build on every run is pure waste.

How the action works

- uses: ./.github/actions/build-mrbind
  with:
    cache-key-prefix: <platform/toolchain identifier>
    build-script:     scripts/mrbind/install_mrbind_<platform>.{sh,bat}
    shell:            cmd | bash         # bash by default
    build-command:    <optional override, e.g. "call ./<script>" on Windows>
    extra-cache-key:  ${{ hashFiles(...) }}   # toolchain version file(s)

On invocation:

  1. Compute the thirdparty/mrbind submodule SHA via git -C thirdparty/mrbind rev-parse HEAD.
  2. actions/cache@v5 over thirdparty/mrbind/build, key = mrbind-build-<prefix>-<sha>-<scriptHash>-<extra>.
  3. If cache hit: skip the build (the install script wipes build/ at the start, so we just don't run it).
  4. If cache miss: run the build, then trim the build dir to just the three executables (mrbind, mrbind_gen_c, mrbind_gen_csharp — the only things generate.mk references) and strip their debug info. The trimmed/stripped dir is what gets cached.
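
The key composition from step 2 can be sketched in shell roughly as follows. This is an illustration, not the action's actual code: `compose_key` is a hypothetical helper, the SHA and hash values are placeholders, and `sha256sum` is only an assumed stand-in for GitHub's `hashFiles()`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the action): joins the four key components
# into the shape mrbind-build-<prefix>-<sha>-<scriptHash>-<extra>.
compose_key() {
  printf 'mrbind-build-%s-%s-%s-%s\n' "$1" "$2" "$3" "$4"
}

# In a real checkout the inputs could be gathered roughly like this; sha256sum
# is an assumed stand-in for hashFiles(), whose output format differs:
#   sha=$(git -C thirdparty/mrbind rev-parse HEAD)
#   script_hash=$(sha256sum scripts/mrbind/install_mrbind_rockylinux.sh | cut -d' ' -f1)

# Placeholder values, for illustration only.
key_before=$(compose_key windows-clang18 1111111 aaaa bbbb)
key_after=$(compose_key windows-clang18 2222222 aaaa bbbb)   # submodule SHA changed

echo "$key_before"
echo "$key_after"
```

Because the submodule SHA is one of the components, bumping thirdparty/mrbind yields a different key, which is what forces a cache miss and a full rebuild.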

Per-platform cache key strategy

| Workflow | cache-key-prefix | Toolchain identifier (extra-cache-key) |
|---|---|---|
| build-test-windows.yml, pip-build.yml (win) | windows-clang18 | msys2_package_hashes_clang18.txt |
| build-test-ubuntu-x64.yml | ubuntu-<os>-<docker_image_tag> | clang_version.txt |
| build-test-ubuntu-arm64.yml | ubuntu-arm64-<os>-<docker_image_tag> | clang_version.txt |
| build-test-linux-vcpkg.yml, pip-build.yml (rocky) | rockylinux8-vcpkg-<arch>-<tag> | (image tag is the identifier) |
| build-test-macos.yml, pip-build.yml (mac) | <matrix.instance>-clang | clang_version.txt |
| update-docs-manual.yml | ubuntu-docs-clang | clang_version.txt |

Notes on discrimination:

  • Docker-based jobs (ubuntu, rocky) include the image tag so the cache invalidates on toolchain updates.
  • macOS keys on matrix.instance (the runner image name), not runner.arch. The two arm64 runner variants — macos-14 (GitHub-hosted, /opt/homebrew) and [self-hosted, macos, arm64, build] (/Users/runner/.homebrew) — share runner.arch=ARM64 but have different homebrew prefixes baked into the mrbind binary's rpath, so they must not share a cache (this caused a dyld: Library not loaded failure when first wired up).
  • macos-latest was replaced with an explicit macos-14 for the arm64 Debug job, matching pip-build.yml's pin, so the two workflows share one cache entry.
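
The difference between keying on runner.arch and keying on matrix.instance can be illustrated with a small shell sketch. `prefix_for` is a hypothetical helper, and the instance names are inferred from the cache names listed in this PR rather than copied from the workflow files.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turns a discriminator into a "<discriminator>-clang" prefix.
prefix_for() {
  printf '%s-clang\n' "$1"
}

# The two arm64 runner variants; both report runner.arch=ARM64.
# The instance names are an assumption inferred from the cache names in this PR.
hosted_arch=ARM64;     hosted_instance=macos-14
selfhosted_arch=ARM64; selfhosted_instance=self-hosted-arm

arch_key_hosted=$(prefix_for "$hosted_arch")
arch_key_selfhosted=$(prefix_for "$selfhosted_arch")
instance_key_hosted=$(prefix_for "$hosted_instance")
instance_key_selfhosted=$(prefix_for "$selfhosted_instance")

echo "by arch:     $arch_key_hosted vs $arch_key_selfhosted"
echo "by instance: $instance_key_hosted vs $instance_key_selfhosted"
```

Keyed by architecture, both runners resolve to the same prefix and would restore each other's binaries; keyed by instance, they stay apart.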

Cache sharing (intentional)

Where two workflows produce byte-identical binaries (same Docker image, same script, same submodule), the keys are unified so they share a single cache entry:

  • rockylinux8-vcpkg-x64-latest ↔ build-test-linux-vcpkg x64 + pip-build x86_64
  • rockylinux8-vcpkg-arm64-latest ↔ build-test-linux-vcpkg arm64 + pip-build aarch64
  • macos-15-intel-clang ↔ build-test-macos x64 + pip-build x86
  • macos-14-clang ↔ build-test-macos arm64 Debug + pip-build arm64

Trimming + stripping

thirdparty/mrbind/build post-build contains 3 executables, 3 intermediate static libs, 32 .o[bj] files, full CMakeFiles tree, and Ninja metadata. Only the executables are consumed downstream (generate.mk:191-193). After Build MRBind succeeds, the action find-deletes everything except the 3 binaries and strips them (strip -S on macOS, MSYS2 strip.exe on Windows, plain strip elsewhere). Devs needing full debug info can rebuild locally; the cached binaries are for downstream consumption only.
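
A minimal sketch of the trimming step, assuming the three executables sit at the top level of the build directory. `trim_build_dir` and the throwaway demo directory are hypothetical, and the `-mindepth`/`-maxdepth` guards are an assumption rather than a copy of the action's exact `find` invocation.

```shell
#!/usr/bin/env bash
# Delete every top-level entry of the build dir except the three executables
# (with or without the Windows .exe suffix).
trim_build_dir() {
  find "$1" -mindepth 1 -maxdepth 1 \
    ! -name 'mrbind' ! -name 'mrbind.exe' \
    ! -name 'mrbind_gen_c' ! -name 'mrbind_gen_c.exe' \
    ! -name 'mrbind_gen_csharp' ! -name 'mrbind_gen_csharp.exe' \
    -exec rm -rf {} +
}

# Demo on a throwaway directory that mimics a post-build tree.
demo=$(mktemp -d)
touch "$demo/mrbind" "$demo/mrbind_gen_c" "$demo/mrbind_gen_csharp"
touch "$demo/libexample.a" "$demo/build.ninja"
mkdir "$demo/CMakeFiles" && touch "$demo/CMakeFiles/example.o"

trim_build_dir "$demo"
remaining=$(ls "$demo")   # capture what survived before cleaning up
rm -rf "$demo"
echo "$remaining"
```

Limiting `find` to depth 1 means whole subdirectories are removed in one go, so `rm -rf` is never asked to delete a path that a parent removal already took with it.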

Results

Wall-clock per matrix entry (master vs PR on cache hit):

| Platform | Master | PR (cache hit) | Reduction |
|---|---|---|---|
| windows | ~190 s (med) | ~3 s | -98% |
| ubuntu-x64 | ~185 s | ~5 s | -97% |
| ubuntu-arm64 | ~140 s | ~4 s | -97% |
| linux-vcpkg | ~151 s | ~3 s | -98% |
| macOS | 69–477 s | ~3 s | ~-95% |

Cache footprint: 102 MiB across 10 entries (peaked at ~1369 MiB / 18 entries during PR iteration before unification and trimming).

| Cache | Size |
|---|---|
| windows-clang18 | 11.5 MiB |
| ubuntu-{ubuntu22,ubuntu24}-latest | 12.4 MiB each |
| ubuntu-arm64-{ubuntu22,ubuntu24}-latest | ~12 MiB each |
| rockylinux8-vcpkg-{x64,arm64}-latest | ~2 MiB each |
| macos-15-intel-clang | 13.0 MiB |
| macos-14-clang | 12.5 MiB |
| self-hosted-arm-clang | 12.3 MiB |

(Rocky entries are an order of magnitude smaller because that toolchain exposes clang-cpp + LLVM as a single shared lib, so mrbind ends up dynamically linked — vs Ubuntu's clangTooling-based static linking. Different binary shape, same source.)

Test plan

  • Cold-cache run populates each platform's cache; downstream Generate/Build/test steps succeed.
  • Warm-cache run skips Build MRBind on every matrix entry; downstream still succeeds.
  • Bumping thirdparty/mrbind causes cache miss → full rebuild.
  • macOS arm64 runners with different homebrew prefixes don't cross-contaminate (caught and fixed mid-PR).
  • update-docs-manual.yml is workflow-dispatch only — not exercised by CI on this PR.

The Build MRBind step compiles a small CMake project (45 Ninja targets)
that depends only on the pinned `thirdparty/mrbind` submodule and the
MSYS2 clang-18 toolchain. Both are stable across most CI runs, but the
build is re-done from scratch every time (~3 min wall-clock).

Cache `thirdparty/mrbind/build` with a key derived from the submodule
SHA, the MSYS2 lockfile, and the install script. On cache hit, skip the
Build MRBind step entirely; the existing mrbind.exe / mrbind_gen_c.exe /
mrbind_gen_csharp.exe are consumed by the subsequent "Generate C
bindings" / "Generate C# bindings" steps as before.

Same pattern as the existing msys2_packages cache in
.github/actions/install-msys2-mrbind.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply the cache-on-submodule-SHA pattern from the previous commit to
every workflow that builds MRBind:

- build-test-windows.yml      (already done; refactored to call the
                               action)
- build-test-ubuntu-x64.yml
- build-test-ubuntu-arm64.yml
- build-test-macos.yml        (split deps install from the build)
- build-test-linux-vcpkg.yml
- pip-build.yml               (3 sections: rocky, windows, macos)
- update-docs-manual.yml

The composite action at .github/actions/build-mrbind:
- computes the `thirdparty/mrbind` submodule SHA,
- caches `thirdparty/mrbind/build` with a key derived from the
  caller-supplied prefix (platform/toolchain), the submodule SHA, the
  hash of the build script, and a caller-supplied extra key,
- runs the build script only on cache miss.

The cache-key-prefix includes the Docker image tag for
container-based jobs (ubuntu, rocky), so the cache invalidates when the
toolchain image changes. On macOS / Windows we use the clang version
file or the MSYS2 lockfile as the toolchain identifier.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
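
The "build only on cache miss" gating described in this commit can be sketched in shell as below. `should_build` and `run_build_step` are hypothetical names; the sketch only mirrors the step condition `steps.cache.outputs.cache-hit != 'true'` and treats any other value (for example 'false' or an empty string) as a miss.

```shell
#!/usr/bin/env bash
# Mirrors the step condition `steps.cache.outputs.cache-hit != 'true'`:
# succeed (build needed) for anything other than the literal string "true".
should_build() {
  [ "$1" != "true" ]
}

# Hypothetical wrapper: first argument is the cache-hit value, the rest is the
# build command to run on a miss.
run_build_step() {
  cache_hit="$1"
  shift
  if should_build "$cache_hit"; then
    "$@"
  else
    echo "cache hit: skipping Build MRBind"
  fi
}

run_build_step true echo "running install script"   # skipped
run_build_step ""   echo "running install script"   # runs the command
```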
Fedr changed the title from "ci(windows): cache mrbind/build keyed on submodule SHA" to "ci(mrbind): cache thirdparty/mrbind/build across all workflows via reusable action" on May 12, 2026
Fedr and others added 7 commits May 12, 2026 19:51
The arm64 matrix in build-test-macos.yml has two ARM runners with
different homebrew prefixes:

  matrix.os=github-arm  -> macos-latest      -> /opt/homebrew
  matrix.os=arm         -> self-hosted ARM   -> /Users/runner/.homebrew

`runner.arch` is ARM64 on both, so they were sharing a cache entry.
The cached mrbind binary embeds the homebrew prefix in its rpath, so
restoring a binary built on one runner onto the other fails with
`dyld: Library not loaded: /Users/runner/.homebrew/opt/llvm@18/lib/libLLVM.dylib`.

Use matrix.os instead. In pip-build.yml's macOS section, use
matrix.instance for the same reason.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Both build-test-linux-vcpkg.yml and pip-build.yml's rocky section build
mrbind inside the same `meshlib/meshlib-rockylinux8-vcpkg-<arch>:<tag>`
Docker image using the same `install_mrbind_rockylinux.sh` script. The
resulting binaries are identical, but they were using distinct cache
key prefixes (`rocky-...` vs `rocky-pip-...`), so each workflow built
its own copy.

Unify both prefixes to `rockylinux8-vcpkg-<arch>-<tag>` -- matching the
Docker image name exactly -- so the two callers share one cache entry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… pip

Two changes that together let build-test-macos.yml and pip-build.yml
share a single mrbind cache entry per runner image:

1. Pin the arm64 Debug job's runner from `macos-latest` to `macos-14`
   so it matches pip-build.yml's pinned `macos-14` runner. The
   `latest` label is rolling; if it advances to macos-15-arm while
   pip-build still pins macos-14, a binary cached under one would be
   restored on the other (and may break, as happened earlier with the
   self-hosted runner's homebrew prefix).

2. Add `instance:` to each matrix entry in build-test-macos.yml,
   matching the field name pip-build.yml already uses, and key the
   cache as `${{ matrix.instance }}-clang` in both. With the pin
   above, the two workflows now resolve to identical keys for the two
   GitHub-hosted images:

     macos-15-intel-clang-<...>   (build-test x64  +  pip x86)
     macos-14-clang-<...>         (build-test arm64 Debug  +  pip arm64)
     self-hosted-arm-clang-<...>  (build-test arm64 Release, alone)

Saves one mrbind build per CI run and one cache entry (~65 MiB).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The prior commit pinned this job to `macos-14` based on a stale memory
that `macos-latest` resolved to `macos-14`. It doesn't -- the migration
to `macos-15` finished 2025-09-01 (see GitHub Changelog link in the
inline comment), so the original `macos-latest` job was actually running
on `macos-15`. Pinning to `macos-14` was a silent downgrade.

Pin to `macos-15` instead -- what `macos-latest` resolves to today -- so
the runner is explicit and won't roll forward without a code change.

This also unpairs the cache from pip-build.yml's arm64, which pins
`macos-14` for wheel-compat. Different macOS SDKs shouldn't share the
mrbind cache anyway; cross-pollination from a macos-15-built binary to
a macos-14 wheel-build job is exactly the kind of subtle breakage we
just fixed with the homebrew-prefix discriminator.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…est target"

This reverts 80e980b. Sharing the mrbind cache with pip-build.yml's
arm64 (which pins macos-14) is worth more than tracking macos-latest's
current resolution. Drop the comment that misstated which version
`macos-latest` resolves to.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Downstream steps only invoke the three executables -- mrbind,
mrbind_gen_c, mrbind_gen_csharp (see scripts/mrbind/generate.mk:191-
193). The rest of `thirdparty/mrbind/build` (object files, static libs,
CMake/Ninja metadata) is dead weight in the cache because the install
script wipes `build/` before any rebuild, so we never use incremental
state from a restored cache anyway.

Add a step that runs after Build MRBind (cache miss only, before the
post-step cache save):

  1. `find ... -exec rm -rf` everything except the three executables.
  2. `strip` debug info from those executables. The CMakeLists builds
     with RelWithDebInfo + -fstandalone-debug; we don't need that
     payload for the cached CI binaries, and a developer hitting a
     real mrbind crash can rebuild locally to get full debug info.

Per-platform strip:
  - macOS: `strip -S` (drop debug symbols, keep dynamic symbol table).
  - Windows: invoke MSYS2 clang64's strip directly (`${MSYS2_DIR}/
    clang64/bin/strip.exe`) since Git Bash's PATH doesn't include it.
  - Linux: plain `strip`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
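
The per-platform strip choice from this commit can be sketched as a small dispatch on the runner OS. `strip_cmd_for` is a hypothetical helper that only prints the command it would use; the OS names match the values of `RUNNER_OS`, and the `C:/msys64` default is taken from the review snippet in this thread.

```shell
#!/usr/bin/env bash
# Print the strip invocation for a given RUNNER_OS value.
# The optional second argument is the MSYS2 root (Windows only).
strip_cmd_for() {
  os="$1"
  msys2_dir="${2:-C:/msys64}"
  case "$os" in
    macOS)   echo "strip -S" ;;                            # drop debug symbols only
    Windows) echo "$msys2_dir/clang64/bin/strip.exe" ;;    # MSYS2 clang64 strip
    *)       echo "strip" ;;                               # Linux: plain strip
  esac
}

strip_cmd_for macOS
strip_cmd_for Windows
strip_cmd_for Linux
```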
Comment thread .github/actions/build-mrbind/action.yml Outdated
Windows)
# The build uses MSYS2 clang64; reach for that strip directly since
# Git Bash's PATH doesn't include it.
STRIP="${MSYS2_DIR:-C:/msys64}/clang64/bin/strip.exe"

Seems like nothing can set MSYS2_DIR at this point?


Good catch — env vars from the previous step aren't propagated to subsequent composite-action steps, so ${MSYS2_DIR:-...} always fell through to the default. Hard-coded C:/msys64/clang64/bin/strip.exe in 6974556dc.
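
The fall-through can be reproduced outside GitHub Actions with two separate shell processes standing in for two steps. This is only an analogy under that assumption; `C:/msys64_custom` is a made-up value used for illustration.

```shell
#!/usr/bin/env bash
# Make sure the demo does not inherit a value from the caller's environment.
unset MSYS2_DIR

# "Step 1": sets and exports MSYS2_DIR, but only inside its own process.
sh -c 'MSYS2_DIR="C:/msys64_custom"; export MSYS2_DIR'

# "Step 2": a fresh process. MSYS2_DIR is unset here, so the :- default wins.
strip_path=$(sh -c 'echo "${MSYS2_DIR:-C:/msys64}/clang64/bin/strip.exe"')
echo "$strip_path"
```

If a later step genuinely needed the value, one documented route is to append a `NAME=value` line to the file named by `$GITHUB_ENV`; this PR instead hard-coded the path.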

Comment thread .github/actions/build-mrbind/action.yml
Fedr and others added 2 commits May 13, 2026 13:18
Three changes in response to #6087 review:

1. Clarify the comment on the `MSYS2_DIR: C:\msys64` env override --
   the bat script defaults to `C:\msys64_meshlib_mrbind` (a separate
   copy for local dev), so the override is needed to point at where
   `install-msys2-mrbind` actually installs MSYS2 on CI runners.

2. Drop the `${MSYS2_DIR:-...}` indirection in the Windows strip
   path. Env vars from the Build MRBind step aren't propagated to
   subsequent composite-action steps, so the expression always fell
   through to the default; hard-code `C:/msys64/clang64/bin/strip.exe`
   instead.

3. Split the trim and strip work into two separate steps and gate the
   strip step on a new `strip-binaries` input (default `'true'`).
   A debugging branch can pass `strip-binaries: 'false'` to retain
   useful stack traces from mrbind crashes without editing the action.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@oitel oitel self-requested a review May 13, 2026 11:00
@Fedr Fedr requested review from oitel and removed request for oitel May 13, 2026 11:43
uses: ./.github/actions/build-mrbind
with:
# Include the Docker image tag (e.g. glibc, clang install) and clang version file in the cache key.
cache-key-prefix: ubuntu-${{ matrix.os }}-${{ inputs.docker_image_tag }}

ubuntu-x64

Comment thread .github/actions/build-mrbind/action.yml Outdated
path: thirdparty/mrbind/build
key: mrbind-build-${{ inputs.cache-key-prefix }}-${{ steps.mrbind-sha.outputs.sha }}-${{ hashFiles(inputs.build-script) }}-${{ inputs.extra-cache-key }}

- name: Build MRBind

Maybe better to implement two different jobs for Windows and for Unix with if: ${{ runner.os == 'Windows' }}.

Comment thread .github/actions/build-mrbind/action.yml Outdated
! -name 'mrbind_gen_csharp' ! -name 'mrbind_gen_csharp.exe' \
-exec rm -rf {} +

- name: Strip debug info from MRBind binaries before cache save

Create different jobs for each platform based on runner.os: https://docs.github.com/en/actions/reference/workflows-and-actions/contexts#runner-context

Address review on #6087:

1. build-test-ubuntu-x64.yml: rename the cache-key-prefix from
   `ubuntu-<os>-<tag>` to `ubuntu-x64-<os>-<tag>` to match the arm64
   workflow's `ubuntu-arm64-<os>-<tag>` shape -- the arch should be
   explicit in the key.

2. build-mrbind/action.yml: split the single "Build MRBind" step into
   "Build MRBind (Windows)" and "Build MRBind (Unix)". Only the Windows
   variant carries the `MSYS2_DIR` env override; previously the env
   was set on every platform and just happened to be ignored on
   non-Windows shells.

3. build-mrbind/action.yml: split the `case "$RUNNER_OS"` strip
   dispatch into three platform-specific steps gated on
   `runner.os == 'macOS' | 'Linux' | 'Windows'`. Same behaviour as
   before, idiomatic GHA conditional steps instead of bash branching.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread .github/actions/build-mrbind/action.yml Outdated
# `C:\msys64_meshlib_mrbind` (a separate copy intended for local dev) to
# match where `install-msys2-mrbind` puts MSYS2 on the CI runner.
if: ${{ steps.cache.outputs.cache-hit != 'true' && runner.os == 'Windows' }}
shell: ${{ inputs.shell }}

shell: cmd

Comment thread .github/actions/build-mrbind/action.yml Outdated
shell: ${{ inputs.shell }}
env:
MSYS2_DIR: C:\msys64
run: ${{ inputs.build-command != '' && inputs.build-command || inputs.build-script }}

run: call ${{ inputs.build-script }}

Comment thread .github/actions/build-mrbind/action.yml Outdated
- name: Build MRBind (Unix)
if: ${{ steps.cache.outputs.cache-hit != 'true' && runner.os != 'Windows' }}
shell: ${{ inputs.shell }}
run: ${{ inputs.build-command != '' && inputs.build-command || inputs.build-script }}

run: ${{ inputs.build-script }}

Comment thread .github/actions/build-mrbind/action.yml Outdated

- name: Build MRBind (Unix)
if: ${{ steps.cache.outputs.cache-hit != 'true' && runner.os != 'Windows' }}
shell: ${{ inputs.shell }}

shell: bash

Comment thread .github/actions/build-mrbind/action.yml Outdated
# -S drops debug symbols but keeps the dynamic symbol table.
if: ${{ steps.cache.outputs.cache-hit != 'true' && inputs.strip-binaries == 'true' && runner.os == 'macOS' }}
shell: bash
run: strip -S thirdparty/mrbind/build/mrbind thirdparty/mrbind/build/mrbind_gen_c thirdparty/mrbind/build/mrbind_gen_csharp

Maybe better to write it as strip -S thirdparty/mrbind/build/{mrbind,mrbind_gen_c,mrbind_gen_csharp}

Comment thread .github/actions/build-mrbind/action.yml Outdated
- name: Strip MRBind binaries (Linux)
if: ${{ steps.cache.outputs.cache-hit != 'true' && inputs.strip-binaries == 'true' && runner.os == 'Linux' }}
shell: bash
run: strip thirdparty/mrbind/build/mrbind thirdparty/mrbind/build/mrbind_gen_c thirdparty/mrbind/build/mrbind_gen_csharp

Maybe better to write it as strip thirdparty/mrbind/build/{mrbind,mrbind_gen_c,mrbind_gen_csharp}

Per #6087 review:

The shell and invocation pattern are determined by the platform, not
the caller. Hard-code them in the action:

  - Windows step: `shell: cmd`, `run: call ${{ inputs.build-script }}`
  - Unix step:    `shell: bash`, `run: ${{ inputs.build-script }}`

Drop the `shell` and `build-command` inputs entirely. Update the two
Windows callers (build-test-windows.yml, pip-build.yml) -- they were
the only ones passing those inputs; every other caller already used
defaults.

Also use bash brace expansion in the strip steps -- shorter and clearer
than spelling out three explicit paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
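
For reference, a short sketch of what the brace expansion in the strip steps resolves to. It relies on bash (plain POSIX sh does not expand braces), and the `build_dir` variable is introduced here purely for illustration.

```shell
#!/usr/bin/env bash
build_dir=thirdparty/mrbind/build

# Brace expansion runs before variable expansion, so this produces one word per
# alternative, each prefixed with the build directory.
paths=$(echo "$build_dir"/{mrbind,mrbind_gen_c,mrbind_gen_csharp})
echo "$paths"
```

The single expression expands to the same three paths that were previously spelled out one by one.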
Fedr and others added 3 commits May 13, 2026 17:21
Add two optional inputs to build-mrbind so the action can be consumed
from a repo where MeshLib is itself a submodule (MeshInspectorCode
#7306) or from a runner where MSYS2 isn't at `C:\msys64`:

- `mrbind-dir`: path to the mrbind submodule. Default
  `thirdparty/mrbind` (MeshLib's own layout). Downstream consumers pass
  e.g. `MeshLib/thirdparty/mrbind`.
- `msys2-dir`: Windows-only MSYS2 location. Default `C:\msys64` (where
  install-msys2-mrbind puts it on CI). Downstream consumers pass
  `C:\msys64_meshlib_mrbind` for runners pre-baked with the bat
  script's default layout.

All existing MeshLib callers continue to use the defaults; no caller
changes needed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The `[self-hosted, macos, arm64, build]` label set matches multiple
physical runners with heterogeneous homebrew prefixes -- some have
homebrew at `/Users/runner/.homebrew`, others at `/opt/homebrew`. The
prefix is baked into `mrbind`'s rpath, so a binary built on one runner
fails with `dyld: Library not loaded` when restored on another.

Hash the brew-prefix and append it to the cache key on both macOS
workflows:

  - build-test-macos.yml: hash `steps.setup.outputs.brew-prefix`
    (computed at runtime from `brew --prefix`).
  - pip-build.yml (macOS section): hash `matrix.brewpath` (pinned in
    the matrix; runners are GitHub-hosted with stable prefixes).

Both workflows use the same `printf | shasum -a 256 | cut -c1-16`
recipe so the keys still align for runners that actually have the same
prefix (e.g. build-test arm64 Debug on macos-14 + pip-build arm64 on
macos-14 both hash `/opt/homebrew` to the same value and share a
cache entry).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Fedr Fedr merged commit 2ab9b92 into master May 13, 2026
100 checks passed
@Fedr Fedr deleted the ci/cache-mrbind-build branch May 13, 2026 19:08

Labels: disable-build-emscripten, full-ci (run all steps), test-pip-build (Build Python wheels (and discard them))
