arm: i64 binop with an i64 param silently miscompiles — param read from wrong registers (both paths) #518

Description

@avrabe

Found by footgun/adversarial differential testing while verifying the v0.17.0 stack-fwd flip. The flip itself is clean: this bug is flip-independent and pre-existing, and reproduces on both v0.16.0 and the v0.17.0 tag. Surfacing as a real soundness bug.

Symptom

A leaf function taking an i64 param and doing any i64 binop against a constant returns the wrong value — the param is effectively dropped.

Minimal repro:

(module (func (export "t") (param i64) (result i64)
  (i64.add (local.get 0) (i64.const 5))))

t(7) → wasmtime: 12, synth: 5 (the constant; param ignored). Same for i64.sub, i64.or, i64.mul, and large constants.

Root cause (from disasm) — BOTH paths affected, differently

Optimized path (--target cortex-m4, no --relocatable): the i64 param (AAPCS R0:R1) is read from R4:R5, which are pushed callee-saved with no r0→r4 / r1→r5 homing move:

stmdb sp!, {r4,r5,r6,r7,r8,lr}
movs r6,#5 ; movs r7,#0          ; const
adds.w r8, r4, r6                ; r8 = R4 + 5   <-- R4 is NOT the param
adc.w  r9, r5, r7                ; (also writes R9 = globals-reserved reg)
mov r0,r8 ; mov r1,r9

Relocatable / shipped path (--relocatable): the const-low is materialized into R1, clobbering the param's high half before the add-with-carry reads it:

movw r1,#5 ; movw r2,#0
adds   r3, r0, r1               ; low = param_low + 5  (correct)
adc.w  r1, r1, r2               ; high = 5 + 0 + carry  <-- reads clobbered R1, not param_high
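
To make the two failure modes concrete, here is a rough Python model of the instruction sequences above. It is a sketch, not the compiler's behaviour. The stale R4:R5 contents (zero by default) are an assumption about what the caller left behind. The relocatable model assumes the low half in r3 and the high half in r1 are returned as-is, which the excerpt does not show. Helper names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def split64(x):
    """Split an i64 into (low, high) 32-bit halves, as AAPCS passes it in R0:R1."""
    x &= MASK64
    return x & MASK32, (x >> 32) & MASK32

def join64(lo, hi):
    """Rebuild an i64 from its 32-bit halves."""
    return ((hi & MASK32) << 32) | (lo & MASK32)

def adds(a, b):
    """32-bit add that also returns the carry-out, like ADDS."""
    s = (a & MASK32) + (b & MASK32)
    return s & MASK32, s >> 32

def adc(a, b, carry):
    """32-bit add-with-carry, like ADC."""
    return ((a & MASK32) + (b & MASK32) + carry) & MASK32

def correct_add(param, const):
    """Reference i64.add semantics (wrapping)."""
    return (param + const) & MASK64

def optimized_path(param, const, stale_r4=0, stale_r5=0):
    """Model of the non-relocatable sequence: the param in R0:R1 is never read."""
    r6, r7 = split64(const)            # movs r6,#5 ; movs r7,#0
    r8, carry = adds(stale_r4, r6)     # adds.w r8, r4, r6  (r4 is NOT the param)
    r9 = adc(stale_r5, r7, carry)      # adc.w  r9, r5, r7
    return join64(r8, r9)              # mov r0,r8 ; mov r1,r9

def relocatable_path(param, const):
    """Model of the --relocatable sequence: r1 is clobbered before ADC reads it."""
    r0, r1 = split64(param)
    r1, r2 = split64(const)            # movw r1,#5 ; movw r2,#0 (param high lost)
    r3, carry = adds(r0, r1)           # low  = param_low + const_low (correct)
    r1 = adc(r1, r2, carry)            # high = const_low + const_high + carry
    return join64(r3, r1)              # assumed return of r3 (low), r1 (high)

if __name__ == "__main__":
    print(correct_add(7, 5), optimized_path(7, 5), hex(relocatable_path(7, 5)))
```

With zeroed stale registers the optimized model returns the bare constant, which matches the `synth: 5` observation for t(7). The relocatable model gets the low word right and corrupts only the high word.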

Scope / not-this

  • i32 is clean: (i32.add (local.get 0) (i32.const 5)) → adds r5, r0, r4 reads R0 correctly. The bug is i64-param-specific.
  • i64 identity ((local.get 0)) is correct (R0:R1 passthrough), which is why the existing i64_lowering_doesnt_clobber_params CI oracle passes — it doesn't exercise an i64 param feeding an i64 binop-with-const.
  • Likely an i64 param-homing gap (the two 32-bit halves of an i64 param are not correctly homed/preserved before first use). May be adjacent to #457, "Read-before-write non-param local not zero-initialized (count_params misclassifies it as a param)" (count_params/param classification).

Suggested gate

An execution differential: compile (param i64)(result i64)(i64.<op> (local.get 0)(i64.const C)) for op ∈ {add,sub,mul,and,or,xor} and small+large C, run under unicorn with the param in R0:R1, diff vs wasmtime. Red today on both paths.
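
A possible shape for the case generator behind such a gate, as a stdlib-only Python sketch. The constant and param lists are illustrative picks, not an agreed matrix. The compile, unicorn, and wasmtime steps are deliberately left out because their invocation is harness-specific. The expected values come from plain wrapping 64-bit arithmetic, and they would still be cross-checked against wasmtime in the real gate.

```python
from itertools import product

MASK32 = 0xFFFFFFFF
MASK64 = (1 << 64) - 1

# Reference semantics for each wasm i64 binop (results wrapped to 64 bits below).
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "and": lambda a, b: a & b,
    "or":  lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}

# Hypothetical small + large constants and params; adjust to taste.
CONSTS = [5, 0xFFFF, 0x1_0000_0000, 0x7FFF_FFFF_FFFF_FFFF]
PARAMS = [0, 7, 0xFFFF_FFFF, 0x1234_5678_9ABC_DEF0]

def wat_module(op, const):
    """WAT text for a leaf that applies i64.<op> to its param and a constant."""
    return (
        '(module (func (export "t") (param i64) (result i64)\n'
        f"  (i64.{op} (local.get 0) (i64.const {const}))))\n"
    )

def expected(op, param, const):
    """Wrapping 64-bit result the compiled code should produce."""
    return OPS[op](param, const) & MASK64

def cases():
    """Yield one differential case per (op, const, param) combination."""
    for op, const, param in product(OPS, CONSTS, PARAMS):
        yield {
            "op": op,
            "wat": wat_module(op, const),
            "r0": param & MASK32,           # param low half, per AAPCS
            "r1": (param >> 32) & MASK32,   # param high half
            "expected": expected(op, param, const),
        }

if __name__ == "__main__":
    for case in cases():
        print(case["op"], hex(case["r0"]), hex(case["r1"]), hex(case["expected"]))
```

Each case carries the R0:R1 seed values so the emulator side can load the param exactly as AAPCS would. The same matrix can be run once per path (with and without --relocatable).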

Byte-changing codegen fix ⇒ separate gated step (re-freeze + differential). Not urgent unless a real workload passes a bare i64 param to an exported leaf (falcon/gust use i32 entry params), but it is a silent miscompile of valid wasm.
