fix(elfpatch): swap patchelf op order to --set-rpath then --set-interpreter#13
Merged
Sunrisepeak merged 1 commit intomainfrom May 2, 2026
Merged
Conversation
…preter
patchelf 0.18.0 corrupts compact ELFs (e.g. ninja 1.12.1 at 273 KB)
when --set-interpreter runs before --set-rpath. The interp op extends
PT_LOAD and shifts the dynamic section; the subsequent rpath op
operates on stale offsets, writes through DT_NEEDED, and the loader
segfaults at execve+1 with no recovery path (--shrink-rpath /
re-set-rpath also fail on the corrupted binary).
Reproduction matrix (patchelf 0.18.0 + ninja 1.12.1, fresh from
upstream):
T1 --set-interpreter only ✓
T2 --set-rpath only ✓
T3 rpath then interp (two calls) ✓
T4 interp then rpath (two calls) ✗ SEGFAULT @ execve+1
T5 one cmd: --set-interpreter --set-rpath ✗ SEGFAULT
T6 one cmd: --set-rpath --set-interpreter ✗ SEGFAULT
T7 same wrong order on openssl (~700 KB) ✓ (large bin survives)
T8 recovery via --shrink-rpath / re-set ✗ unrecoverable
Insight: T5 vs T6 — patchelf processes interp before rpath
INTERNALLY regardless of CLI argument order, so combined-command form
is equivalent to T4. The only safe ordering is two separate
invocations with rpath first.
Fix: reverse op order in `_patch_elf_executables` (declarative bins
mode) and `_patch_elf` fallback (full-scan mode).
Why reverse-order is safe:
- rpath op only mutates the dynamic section (DT_RUNPATH entry);
program headers stay intact
- interp op then extends .interp / appends PT_LOAD, but no patchelf
op runs after it — so stale offsets don't matter
Cross-platform scope:
- Linux: this fix
- macOS: unaffected (install_name_tool, no equivalent op-order issue)
- Windows: unaffected (no patchelf path)
Regression test: ApplyElfpatchAuto_LinuxRpathBeforeInterpreter asserts
log order via stub patchelf shell script. Catches future accidental
reverts.
Reported by: mcpp sandbox bring-up — initially traced as "ninja install
in 0.4.12 sandbox produces segfaulting binary". Earlier diagnosis
hypothesized double-patch corruption, but characterization narrowed it
to single-pass order sensitivity in patchelf 0.18.0 itself.
Full analysis: docs/plans/2026-05-03-patchelf-order-bug-analysis.md
Member
Author
|
Linux CI failure here is the same bug this PR fixes, now triggered industry-wide because xim-pkgindex#108 (merged ~30min ago) added |
3 tasks
Sunrisepeak
added a commit
to openxlings/xlings
that referenced
this pull request
May 2, 2026
* release: 0.4.13 — fix(elfpatch): patchelf op order workaround
Bumps VERSION 0.4.12 → 0.4.13 + add_requires("mcpplibs-xpkg 0.0.37").
Hotfix for a patchelf 0.18.0 corruption bug that surfaced production-
wide after xim-pkgindex#108 declared `runtime = { glibc, gcc }` deps
on five common prebuilt packages (cmake, mdbook, ninja, node, xmake).
Predicate-driven elfpatch then fired on these binaries; for compact
ELFs (~≤ 280 KB, e.g. ninja 1.12.1 at 273 KB) patchelf 0.18.0 has an
order-sensitive corruption bug: --set-interpreter before --set-rpath
shifts the dynamic section so the subsequent rpath op writes through
DT_NEEDED, segfaulting the binary at execve+1 with no recovery.
mcpplibs-xpkg v0.0.37 reverses the op order in
_patch_elf_executables and _patch_elf fallback so --set-rpath runs
first. Linux-only fix; macOS / Windows unaffected (different
toolchains, no equivalent issue).
Symptom on 0.4.12:
$ xlings install ninja
$ ninja --version
Segmentation fault (core dumped)
# OR
$ cmake ... # cmake driving small build tools
failed(-1) — child process killed by signal
Symptom on 0.4.13 (verified locally with isolated XLINGS_HOME):
$ xlings install ninja
elfpatch._apply: source=predicate:single
ninja: elfpatch auto: 1 1 0
$ ninja --version
1.12.1
INTERP=…/xim-x-glibc/2.39/lib64/ld-linux-x86-64.so.2
RUNPATH=…/xim-x-glibc/lib64:…/xim-x-gcc/lib64:…/subos/default/lib
Migration: 0.4.12 → 0.4.13 is binary-compatible. The in-place
self-update path (xself install / quick_install) handles the upgrade.
After upgrade, packages affected by xim-pkgindex#108 (cmake / mdbook /
ninja / node / xmake) need a re-install via `xlings uninstall <pkg> &&
xlings install <pkg>` to rebuild the corrupted payload — `xlings
install` alone won't re-patch already-installed payloads (correct
"installed payload is sacred" behavior, see installer.cppm:1080).
For the auto-recovery path (broken payload silently re-patched on
next install), see TODO in upstream tracking.
Verified locally:
- libxpkg unit tests 27/27 pass (post-fix, includes new
ApplyElfpatchAuto_LinuxRpathBeforeInterpreter regression test)
- E2E in isolated XLINGS_HOME: ninja install with new predicate +
new order produces a working binary, INTERP / RUNPATH correctly
set, `ninja --version` exits 0
libxpkg PR: openxlings/libxpkg#13 (merged, v0.0.37)
Analysis: docs/plans/2026-05-03-patchelf-order-bug-analysis.md (in libxpkg)
mcpplibs-index: mcpplibs/mcpplibs-index@bc4f740 registers v0.0.37
* ci: retrigger after xim-pkgindex#110 merge
* ci: retrigger after xim-pkgindex#111 (comprehensive namespace deps)
* ci: pin XIM_PKGINDEX_REF to stable commit (pre-#108) for fixture clone
The xlings E2E test fixture is dynamically cloned from xim-pkgindex
main at CI time. xim-pkgindex#108 (declare runtime deps on
cmake/mdbook/ninja/node/xmake) introduced bare-name deps that cascade
into ambiguity errors in E2E-05's dual-mount scenario, plus exposed
latent issues (gcc-specs-config not auto-activated, node not finding
libstdc++ at runtime).
Pin XIM_PKGINDEX_REF to commit 9cb6982 (PR #106 merge — last commit
before #108) until those cascading issues stabilize. This restores the
previously-passing E2E green state and decouples xlings's release CI
from in-flight xim-pkgindex churn.
Bump after xim-pkgindex deps stabilize: namespace prefixes universal
(continuation of #110/#111), runtime lib closure verified (gcc's
lib64 actually reaches consumers' RPATH), and gcc-specs-config
activation pathway sorted out.
Applied to all four workflows that clone the fixture: xlings-ci-linux,
xlings-ci-macos, xlings-ci-windows, release.
* ci: support commit-SHA refs in prepare_fixture_index.sh
`git clone --branch` only accepts branch/tag names, not SHAs. When
XIM_PKGINDEX_REF is set to a commit SHA (40-char hex), do a full clone
then `git checkout <sha>`. Branch/tag refs continue to use the depth-1
shallow clone path.
* ci: pin XIM_PKGINDEX_REF to stable commit (pre-#108) for fixture clone
Bind XIM_PKGINDEX_REF time-matched with BOOTSTRAP_XLINGS_VERSION=v0.4.8
(xlings released 2026-04-30 18:25 UTC, this commit lands 2026-04-30
17:43 UTC — last xim-pkgindex commit before 0.4.8 went live, so the
schema/conventions match what 0.4.8 expects).
Decouples xlings CI from in-flight xim-pkgindex churn (post #107: bare-
name deps + new runtime-dep declarations on prebuilts cascading into
E2E-05's dual-mount scenario). Bump in lockstep with BOOTSTRAP_XLINGS
once both stabilize.
Applied to all four workflows that clone the fixture: xlings-ci-linux,
xlings-ci-macos, xlings-ci-windows, release.
* ci: use full 40-char SHA for XIM_PKGINDEX_REF (regex match)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Reverse op order in `_patch_elf_executables` and `_patch_elf` fallback so `--set-rpath` runs before `--set-interpreter`. Workaround for a patchelf 0.18.0 corruption bug on compact ELFs (~≤ 280 KB).
Background
mcpp sandbox bring-up surfaced "ninja install in 0.4.12 sandbox produces a segfaulting binary". Earlier hypotheses (double-pass corruption, RPATH parsing) were narrowed by an empirical test matrix to single-pass op-order sensitivity in patchelf itself.
Reproduction matrix (patchelf 0.18.0 + fresh ninja 1.12.1)
Key: T5 vs T6 — patchelf processes interp before rpath INTERNALLY regardless of CLI argument order, so combined-command form is equivalent to T4. The only safe ordering is two separate invocations with rpath first.
Why reverse-order is safe
Cross-platform scope
Changes
Test plan
Ship plan
After merge:
Related