perf(relay): defer post-commit dispatch and avoid verify clone #1453

Open
tlongwell-block wants to merge 3 commits into main from eva/relay-perf-w8-arc-verify

Conversation

@tlongwell-block tlongwell-block commented Jul 2, 2026

Collaborator

Summary

Relay performance stack for the W1/W8 lane:

  • defer post-commit event dispatch out of the EVENT OK ack path
  • preserve audit-log backpressure by awaiting the bounded audit enqueue before detaching fan-out work
  • avoid deep-cloning accepted events for signature verification by sharing through Arc

Stack:

  1. 144f00bc perf(relay): defer post-commit event dispatch
  2. 605f7127 fix(relay): preserve audit backpressure after W1
  3. c61b4c14 perf(relay): share event with sig-verify task via Arc instead of deep clone

Correctness notes

  • Event acceptance/audit durability ordering is preserved: bounded audit enqueue remains awaited before detached post-commit work.
  • Detached work is limited to Redis publish/local fan-out/workflow/delivery metrics; it runs after the DB commit and off the accepted-event response path.
  • W8 changes allocation/ownership shape only; signature verification still runs before event persistence/dispatch.
  • Delivery-before-OK was never a NIP/repo correctness guarantee; W1 intentionally prioritizes sender-perceived OK latency by moving post-commit dispatch after the ack path. The observed local receiver delivery cost is named and accepted below.
  • A bounded post-commit worker/channel is intentionally deferred as a Tier-3 follow-up, to be picked up only if fresh-session/prod-like data reproduces the cross-pod p999 tail regression.
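
The ordering described in these notes can be sketched roughly as follows. This is a std-only illustration, not the relay's code: `AuditEntry`, `accept_event`, and the `fanout` callback are hypothetical names, `std::sync::mpsc::sync_channel` stands in for the bounded audit queue, and `thread::spawn` stands in for whatever task spawner the relay actually uses.

```rust
use std::sync::mpsc::SyncSender;
use std::thread::{self, JoinHandle};

// Hypothetical audit record; the relay's real type is not shown in this PR.
struct AuditEntry {
    event_id: String,
}

// Accept-path sketch. Returns the ack plus a handle to the detached work so a
// caller can observe it; a real handler would most likely drop the handle.
fn accept_event(
    event_id: String,
    audit_tx: &SyncSender<AuditEntry>,
    fanout: impl FnOnce(String) + Send + 'static,
) -> Result<(String, JoinHandle<()>), String> {
    // 1. The DB commit would happen here (omitted in this sketch).

    // 2. Bounded enqueue: `send` on a sync_channel blocks while the queue is
    //    full, so a slow audit consumer still applies backpressure here.
    audit_tx
        .send(AuditEntry { event_id: event_id.clone() })
        .map_err(|_| "audit queue closed".to_string())?;

    // 3. Only after the audit enqueue succeeds is fan-out detached
    //    (Redis publish, local delivery, metrics in the PR's terms).
    let id_for_task = event_id.clone();
    let handle = thread::spawn(move || fanout(id_for_task));

    // 4. Ack without waiting for the detached work to finish.
    Ok((format!("OK {}", event_id), handle))
}
```

The point of the shape is that step 2 stays on the ack path while step 3 does not, which is the split between 605f7127 and 144f00bc as described above.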

Validation

Code validation already run on this stack:

  • cargo fmt --check
  • cargo test -p buzz-relay handlers::event::tests
  • full cargo test -p buzz-relay
  • pre-push hooks passed before remote verification
  • remote branch verified at c61b4c149fe30f26e060370fea645412986adf93

Benchmark guardrail / tradeoff:

  • W1 raw ack p99 win across A/B runs: -11% to -28%; raw W1 cross-pod p999 tail regression observed and investigated.
  • 605f7127 restored audit backpressure; clean cross-pod p999 recovered to 5.60ms vs raw W1 15.64ms and baseline 6.98ms.
  • W8-on-fix c61b4c14: all runs accepted, 0 timeouts; ack p99 remains below baseline across all three protocols; W8 p99 deltas vs W1+fix are ≤0.12ms on 200-byte bodies; guardrail PASS.
  • Formal §4.13 delivery-ordering addendum: local first-delivery p99 rises +0.3–0.5ms same-pod because fan-out moved from pre-OK inline to post-commit spawned. This is the architectural cost of the ack p99 win (-14 to -23%); the two are the same lever. No repo NIP, TLA+ invariant, or Tamarin lemma binds delivery ordering relative to OK — this is a soft-consistency change, not a fence break. Cross-pod first-delivery p99 is within measurement noise as expected (receivers already routed through Redis). Trade accepted per bench readout.
  • Bench artifacts: RESEARCH/RELAY_PERF_BENCH_RUNS/w1fix-605f7127-* and RESEARCH/RELAY_PERF_BENCH_RUNS/w8-c61b4c14-*.
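
For readers comparing the p999 figures quoted above against baseline, the relative change can be computed with a one-line helper. The function name is hypothetical and the inputs are only the numbers already stated in this description; nothing here is taken from the bench artifacts themselves.

```rust
// Relative change of `value` against `baseline`, in percent.
// Negative means `value` is lower (faster) than the baseline.
fn rel_change_pct(baseline: f64, value: f64) -> f64 {
    (value - baseline) / baseline * 100.0
}
```

With the 6.98ms baseline, the recovered 5.60ms cross-pod p999 works out to roughly -19.8%, and the raw W1 15.64ms figure to roughly +124%.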

npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 and others added 3 commits July 1, 2026 22:11
Co-authored-by: npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 <5217c5c2f7bfb4333e46d17c98a9255a52dadee18dcd43a43536b95e6776dfa0@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 <5217c5c2f7bfb4333e46d17c98a9255a52dadee18dcd43a43536b95e6776dfa0@sprout-oss.stage.blox.sqprod.co>
Co-authored-by: npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 <5217c5c2f7bfb4333e46d17c98a9255a52dadee18dcd43a43536b95e6776dfa0@sprout-oss.stage.blox.sqprod.co>
Signed-off-by: npub12gtutshhh76rx0jx697f32f9tffd4hhp3hx58fp4x6u4uemkm7sqf8f757 <5217c5c2f7bfb4333e46d17c98a9255a52dadee18dcd43a43536b95e6776dfa0@sprout-oss.stage.blox.sqprod.co>
… clone

verify_event only needs a borrow; spawn_blocking only needs 'static.
Wrapping the event in an Arc avoids deep-cloning tags + up to 256 KB of
content on every ingest. After verification the Arc is uniquely held
(the verify task's clone is dropped on completion), so try_unwrap
returns the original event; the clone fallback is unreachable in
practice but keeps the code total.
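
A minimal std-only sketch of the pattern this commit message describes, under stated assumptions: the `Event` struct, its fields, the body of `verify_event`, and the `verify_shared` wrapper are all hypothetical, and `thread::spawn` stands in for `spawn_blocking` (both require `'static` captures).

```rust
use std::sync::Arc;
use std::thread;

// Hypothetical stand-in for the relay's event type; field names are assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Event {
    id: String,
    tags: Vec<Vec<String>>,
    content: String,
}

// Hypothetical verifier body. Like the `verify_event` described above,
// it only needs a borrow of the event.
fn verify_event(event: &Event) -> bool {
    !event.id.is_empty()
}

// Verify on another thread without deep-cloning the event.
fn verify_shared(event: Event) -> Option<Event> {
    let shared = Arc::new(event);
    // Refcount bump only; tags and content are not copied.
    let for_task = Arc::clone(&shared);
    let ok = thread::spawn(move || verify_event(&for_task))
        .join()
        .unwrap_or(false);
    if !ok {
        return None;
    }
    // The task's Arc was dropped when its closure finished, so the Arc is
    // normally unique here and `try_unwrap` hands back the original event.
    // The clone fallback only keeps the function total.
    Some(Arc::try_unwrap(shared).unwrap_or_else(|still_shared| (*still_shared).clone()))
}
```

Because the event is moved into and back out of the `Arc`, the heap buffers behind `tags` and `content` are never reallocated on the success path.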

W8 in PLANS/RELAY_PERF_OPTIMIZATION_PLAN.md; ruled GREEN (no invariant
touched) in RESEARCH/RELAY_PERF_CORRECTNESS.md. cargo test -p buzz-relay:
436 passed, 0 failed.

Co-authored-by: Tyler Longwell <tlongwell@block.xyz>
Signed-off-by: Tyler Longwell <tlongwell@block.xyz>