ci: change runner to solx-linux-amd64-self-hosted#375
Conversation
047336d to
38015a7
Compare
|
📊 solx Tester Report ➡️ Download |
|
📊 Hardhat Projects Report ➡️ Download |
|
📊 Foundry Projects Report ➡️ Download |
Coverage Summary
|
4b3dfaf to
194d2f6
Compare
There was a problem hiding this comment.
Pull request overview
This PR updates CI workflows that run on the solx Linux AMD64 runner to use a persistent self-hosted runner, and adds/refactors build-artifact sharing between the PR checkout and a secondary temp-solx-main checkout to reduce redundant builds while avoiding stale reuse.
Changes:
- Switch
runs-onfromsolx-linux-amd64tosolx-linux-amd64-self-hostedin the sanitizer, coverage, and integration workflows. - In
integration-tests.yaml, refactor LLVM sharing to an output-driven “share then conditional build” pattern, and add an equivalent solc sharing step via symlinks when submodule SHAs match. - Add cleanup (
rm -rfbeforeln -snf) in sharing steps to prevent stale paths/symlinks from being reused on a persistent workspace.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| .github/workflows/sanitizer.yaml | Switch sanitizer workflow to the persistent self-hosted runner label. |
| .github/workflows/coverage.yaml | Switch coverage workflow to the persistent self-hosted runner label. |
| .github/workflows/integration-tests.yaml | Switch to self-hosted runner; refactor LLVM sharing/build flow; add solc sharing/build flow with stale-path cleanup. |
hedgar2017
left a comment
There was a problem hiding this comment.
@nebasuke — stepping back from the per-line comments for a moment, the three problems flagged here (the two stale-symlink fallback paths, plus the build-script-hash dimension that share-solc misses) all come from the same place: the workflow is reimplementing a SHA-keyed mirror in bash, in parallel to the cache-key logic the build-llvm and build-solc composite actions already own.
Worth considering folding the mirror in as a mirror-from input on those actions instead. The workflow then collapses to two ordinary uses: calls per artifact:
- name: Build LLVM (PR)
uses: ./.github/actions/build-llvm
with: { build-type: Release, enable-assertions: 'false' }
- name: Build LLVM (main)
uses: ./.github/actions/build-llvm
continue-on-error: true
with:
working-dir: temp-solx-main
mirror-from: '.'
build-type: Release
enable-assertions: 'false'
- name: Build solc (PR)
uses: ./.github/actions/build-solc
- name: Build solc (main)
uses: ./.github/actions/build-solc
continue-on-error: true
with:
working-dir: temp-solx-main/solx-solidity
mirror-from: solx-solidityInside the action, a single new step gated on inputs.mirror-from != '', falling through to the existing cache-restore + build chain when it can't mirror:
- name: Mirror artifact from sibling checkout
id: mirror
if: inputs.mirror-from != ''
shell: bash
run: |
SRC="${{ inputs.mirror-from }}/target-llvm/target-final"
DST="${{ inputs.working-dir }}/target-llvm/target-final"
SRC_KEY=$(<compute the same cache key, with mirror-from's SHAs>)
[ "$SRC_KEY" = "${{ steps.llvm-cache.outputs.key }}" ] && [ -d "$SRC" ] || {
echo "mirrored=false" >> "$GITHUB_OUTPUT"; exit 0
}
rm -rf "$DST"
mkdir -p "$(dirname "$DST")"
cp -al "$SRC" "$DST"
echo "mirrored=true" >> "$GITHUB_OUTPUT"Existing build steps' if: extends to ... && steps.mirror.outputs.mirrored != 'true'.
Two things this gets that the surgical fixes won't:
- Cleanup is unconditional and adjacent to the write —
rm -rf "$DST"sits one line abovecp -al, so it can't drift out of the SHA-match branch the way the currentrm -rfdid. Submodule-internal stale state and Copilot's broader "staletemp-solx-main" concern collapse into the same fix. - The mirror predicate inherits the action's full cache-key dimensions for free — boost version, MLIR flag, build-script hash, sanitizer — so the share decision can't drift from the cache decision as keys grow.
Also cp -al instead of ln -snf. Hardlinks under cmake/cargo's rename-on-write conventions don't have the through-write surface symlinks do, and the consumers here are read-only.
Bigger than a surgical fix on this PR — happy to take it as a follow-up if you'd rather close #375 with the rm -rf cleanup and the script-hash check in the share predicate (per the inline comments).
b975657 to
0be720f
Compare
|
Thanks — going with the surgical path on this PR. Filed the What landed here (path B):
Why not path A on this PR: the architectural win (mirror predicate inheriting the action's cache-key dimensions for free, hardlinks instead of symlinks, cleanup adjacent to the write) is real and worth doing, but it rewrites the composite actions — bigger blast radius than the three defects this PR is fixing. Easier to review the surgical patch against the inline comments and ship that, then refactor cleanly under #448. |
0be720f to
a1b1ec1
Compare
Single-job probe on solx-linux-amd64-self-hosted with no container and no checkout, so we can read user/id, workspace path ownership, and mount info. Triggered only when this file is touched. Temporary; remove once the runner is reachable.
Mirror the existing 'Share LLVM with main checkout' pattern for solc. When the solx-solidity SHA on the PR matches main (the typical case for PRs that don't bump the submodule), symlink solx-solidity/build and solx-solidity/boost from the PR checkout into temp-solx-main and skip the redundant Build solc (main) step. Observed cost saving: ~2 minutes per integration-tests run, where Build solc (main) was hitting the artifact cache differently than Build solc (PR) and falling through to a ccache-assisted rebuild even with identical cache keys.
Remove the diagnostic probe workflow now that the self-hosted runner is validated, refactor the LLVM share step to the same output+skip pattern used by the new solc share step, and clear stale dirs/symlinks before relinking so a persistent self-hosted workspace can't reuse a previous run's main-build output.
actions/checkout's `git clean -ffdx` doesn't recurse into submodules, so
on the persistent self-hosted workspace a prior run can leave stale
solx-solidity/{build,boost} dirs or symlinks (pointing at the PR's
freshly-built artifacts) inside temp-solx-main. Subsequent SHA-mismatch
runs would extract through those symlinks via `actions/cache/restore`,
silently corrupting the PR's build output and skewing benchmarks.
Wholesale `rm -rf temp-solx-main` before the optional checkout closes
this and the symmetric stale-checkout concern in one stroke.
build-solc/action.yml keys its artifact cache on submodule SHA *plus* sha256 of `solx-dev/src/solc`. The share predicate matched only on SHA, so a PR that edits `solx-dev/src/solc/*` without bumping the solx-solidity submodule pin would silently share its own solc binary as the "main" baseline, skewing the benchmark with no signal. Compute the same script hash on both sides and require both checks to pass before symlinking. Falling back to a real `Build solc (main)` is strictly better than a silent share when inputs diverge.
Mirror the share-solc fix: `solx-dev llvm build` is driven by `solx-dev/src/llvm/*`, so a PR that edits those scripts without bumping the solx-llvm submodule pin would silently share its own LLVM build as the "main" baseline. Compute and compare the script hash on both sides before symlinking, falling back to a real LLVM build for main when they diverge.
Pull the dense `find | sort -z | xargs -0 cat | sha256sum | cut` pipeline into a named shell function. The hash is computed twice per step (PR vs main) and the original one-liner obscured what was being compared.
a1b1ec1 to
283cb6a
Compare
solx-linux-amd64(per-run) tosolx-linux-amd64-self-hosted(persistent) in the three workflows that target it:integration-tests.yaml,coverage.yaml,sanitizer.yaml.integration-tests.yaml, share the solc build between the PR checkout and thetemp-solx-maincheckout via symlink when their solc submodule SHAs match — same pattern as the existing LLVM-sharing step.steps.share-llvm.outputs.shared) as the new solc step instead of building inline, so the two share/build pairs are symmetric.rm -rfbeforeln -snf) in both share steps so a previous run's main-build output on the persistent self-hosted workspace can't be reused.Runs
Integration tests: