x64: report an error instead of panicking on stack frames larger than i32::MAX#13783
x64: report an error instead of panicking on stack frames larger than i32::MAX#13783soumik15630m wants to merge 2 commits into
Conversation
| if rsp_relative.saturating_add(probe_margin) > i32::MAX as u64 | ||
| || rbp_relative > i32::MAX as u64 | ||
| { | ||
| return Err(crate::CodegenError::Unsupported( |
There was a problem hiding this comment.
Should probably use ImplLimitExceeded instead.
There was a problem hiding this comment.
Good call, thanks. ImplLimitExceeded doesn't carry a message though ...so is it okay?Happy to switch either way; want to make sure I'm not throwing away information you'd want to keep.
| insts | ||
| } | ||
|
|
||
| fn validate_frame_layout(frame_layout: &FrameLayout, guard_size: u32) -> CodegenResult<()> { |
There was a problem hiding this comment.
This validation should probably be done in generic code (maybe the caller of compute_frame_layout?). The FrameLayout type uses 32bit ints, so the 4GB stack frame size applies to all targets.
There was a problem hiding this comment.
Sure ... will push a commit soon
There was a problem hiding this comment.
Pushed a commit adding a generic check in compute_frame_layout.
I kept the x64-specific validate_frame_layout check as well, since it's
catching a different, tighter condition: x64 encodes these offsets as
signed 32-bit immediates (i32::MAX, not u32::MAX), and the inline
probestack loop rounds the frame size up to the next guard-page boundary
before encoding it - so a frame just under i32::MAX can get pushed
over it by that rounding. The original repro (~2.147GB) is under
u32::MAX (~4.295GB), so the generic check alone doesn't catch it -
only the x64-specific one does.
Happy to remove the x64-specific check if you'd rather keep this simpler
and generic-only, but wanted to flag that it currently exists to catch
a real case the generic check can't - unless other backends share x64's
signed-immediate constraint, in which case maybe it'd make sense to
generalize the tighter check too rather than keep it x64-only. Let me
know which direction you'd prefer.
48d5999 to
b94de39
Compare
|
Thanks for the PR, but personally I'm pretty wary of adding a parallel path that calculates information about a frame that duplicates what's already in each backend. Frame handling/layout is pretty precise and requires extremely careful handling, and I wouldn't be confident that it could be accurately mirrored elsewhere in the system. What I think should happen here is either (a) to propagate a result upwards at the time of a |
|
@alexcrichton 1. RSP arithmetic (the 2. Stack-relative memory addressing (
One thing worth flagging on option (a): Given that, (b) seems like the more proportionate direction, and I'm |
Summary
x64 codegen panics with
TryFromIntErrorwhen a function's stack frameis large enough that offsets relative to it no longer fit in a 32-bit
immediate. This affects three places: the inline stack-probe loop
(
guard_size * probe_count), clobber save/restore (RSP-relativeoffsets), and incoming-argument access (RBP-relative offsets via
SyntheticAmode::IncomingArg::finalize).This was reported downstream in
rust-lang/rustc_codegen_cranelift#1656, where a ~2GB stack allocation
(
[0; 536870787_usize], a[i32; N]array) triggers an ICE:thread 'optimize module ...' panicked at cranelift/codegen/src/isa/x64/inst/emit.rs:373:18:
guard_size * probe_count is too large to fit in a 32-bit immediate: TryFromIntError(PosOverflow)
What this does
Adds
ABIMachineSpec::validate_frame_layout, a new trait method with ano-op default, overridden for x64. It's called from
VCode::emitrightafter
compute_frame_layout, before any offset encoding takes place,and checks:
On overflow, this returns
CodegenError::Unsupportedinstead ofpanicking. Only x64 overrides the new trait method; other backends are
unaffected (
aarch64/riscv64/s390x/pulleyneeded a one-line?each to propagate
emit's now-fallible return type, no behaviorchange).
Testing
isa/x64/abi.rscovering: a normal frame(accepted), an oversized
fixed_frame_storage_size(rejected), anoversized
tail_args_sizevia the RBP-relative path (rejected), anda frame that's under
i32::MAXon its own but pushed over by theguard-page rounding margin (rejected) — this last case specifically
reproduces the original bug.
Known follow-up (not included here)
bjorn3's comment mentions "all stack load/store instructionlowerings" more broadly. I traced and covered the two offset-encoding
paths that are actually reachable from the repro (
gen_clobber_save/gen_clobber_restoreandSyntheticAmode::IncomingArg::finalize), butI have not exhaustively audited every stack-relative offset computation
in the x64 backend for the same class of narrowing. Happy to widen this
if there's a specific path you'd like checked.
Separately,
rustc_codegen_craneliftcurrently panics when it receivesany
ErrfromContext::compileatsrc/base.rs:224rather than converting it to a rustc diagnostic. I have open a PR rust-lang/rustc_codegen_cranelift/pull/1669 there to handle
CodegenError::Unsupportedthe same waythe existing
CodegenError::Verifierarm already does.