fix: retry settlement on conflicting_nonce with exponential backoff #106
anansutiawan wants to merge 1 commit into aibtcdev:main
…al backoff
When two concurrent x402 payments hit the relay within the same second, the relay may reject one with conflicting_nonce. This is a transient relay-side race condition, not a client error. Retry up to 3 times with 5s/10s/20s backoff before propagating the error to the caller.
Closes aibtcdev#84
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
arc0btc left a comment
Adds transparent retry logic for conflicting_nonce settlements — good call. We process x402 payments through this middleware and have hit nonce contention when concurrent settlements race at the relay within the same second.
What works well:
- Retry is scoped only to `conflicting_nonce`: all other failures (exceptions, non-retryable result codes) fall through immediately with no behavior change
- The `settleAttempt < CONFLICTING_NONCE_MAX_RETRIES` guard before incrementing means the array index is always in-bounds (0, 1, 2)
- `log.warn` on each retry attempt is the right call: it gives operators visibility into contention frequency without polluting the error log
- Indentation and nesting of the refactored catch block are clean
[suggestion] Total backoff could exceed client HTTP timeouts (src/middleware/x402.ts)
Worst case: 5s + 10s + 20s = 35s before conflicting_nonce is propagated to the caller. Clients (or upstream proxies) with a 30s timeout will see a gateway timeout before the final retry completes. Consider documenting the maximum latency implication, or surfacing it as a configurable option:
// Total worst-case latency: 35s (5+10+20). Ensure any upstream HTTP timeout
// (proxy, CDN, client) exceeds this value, or reduce CONFLICTING_NONCE_MAX_RETRIES.
const CONFLICTING_NONCE_MAX_RETRIES = 3;
const CONFLICTING_NONCE_BACKOFF_MS = [5_000, 10_000, 20_000];
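One way to make the latency implication explicit is to derive it from the backoff schedule rather than restate it in a comment. The following is a minimal sketch, not code from this PR: `NonceRetryOptions`, `worstCaseBackoffMs`, and `fitToBudget` are hypothetical names, and the 30s budget is just the example timeout mentioned above.

```typescript
// Hypothetical options shape; the PR hard-codes these values as constants.
interface NonceRetryOptions {
  backoffMs: readonly number[]; // delay before each retry, in order
}

const DEFAULT_NONCE_RETRY: NonceRetryOptions = {
  backoffMs: [5_000, 10_000, 20_000],
};

// Sum of all backoff delays: the time spent sleeping if every retry is used.
// This excludes the duration of the settle() calls themselves.
function worstCaseBackoffMs(opts: NonceRetryOptions): number {
  return opts.backoffMs.reduce((total, ms) => total + ms, 0);
}

// Keep only the leading delays whose cumulative sum fits within budgetMs.
function fitToBudget(opts: NonceRetryOptions, budgetMs: number): NonceRetryOptions {
  const kept: number[] = [];
  let total = 0;
  for (const ms of opts.backoffMs) {
    if (total + ms > budgetMs) break;
    kept.push(ms);
    total += ms;
  }
  return { backoffMs: kept };
}

console.log(worstCaseBackoffMs(DEFAULT_NONCE_RETRY)); // 35000
console.log(fitToBudget(DEFAULT_NONCE_RETRY, 30_000).backoffMs); // [ 5000, 10000 ]
```

With a 30s upstream timeout, this drops the final 20s retry, leaving 15s of backoff at most. Whether trimming retries or raising the timeout is the better trade-off depends on the deployment.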
[nit] Unreachable fallback in backoff lookup (src/middleware/x402.ts)
The ?? 20_000 default is dead code — settleAttempt is always 0, 1, or 2 when this line executes (the condition settleAttempt < 3 guarantees it). No functional problem, just a slightly misleading guard:
const delayMs = CONFLICTING_NONCE_BACKOFF_MS[settleAttempt];
Code quality notes:
- The `while (true)` + `break` pattern is readable given the existing nesting depth. A helper like `settleWithRetry()` would clean it up further but isn't necessary here.
- The `// end while (retry loop)` comment at the closing brace is mildly redundant: the variable names and structure make the intent clear.
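For reference, a rough sketch of what such a `settleWithRetry()` helper might look like. This is an illustration under assumptions, not the PR's code: the `SettleResult` shape (beyond `errorReason`), the injected `sleep`, and the `onRetry` callback are all hypothetical. Iterating over the backoff array directly also sidesteps the index lookup, so no fallback default is needed.

```typescript
// Assumed result shape; only errorReason is named in the PR.
interface SettleResult {
  success: boolean;
  errorReason?: string;
}

const CONFLICTING_NONCE_BACKOFF_MS = [5_000, 10_000, 20_000];

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: retries only when errorReason is conflicting_nonce.
// Any other result is returned at once, and exceptions thrown by settle()
// propagate to the caller without a retry.
async function settleWithRetry(
  settle: () => Promise<SettleResult>,
  sleep: (ms: number) => Promise<void> = defaultSleep,
  onRetry: (attempt: number, delayMs: number) => void = () => {},
): Promise<SettleResult> {
  let result = await settle();
  let attempt = 0;
  // One retry per entry in the backoff schedule (3 retries, 4 calls at most).
  for (const delayMs of CONFLICTING_NONCE_BACKOFF_MS) {
    if (result.errorReason !== "conflicting_nonce") break;
    attempt += 1;
    onRetry(attempt, delayMs);
    await sleep(delayMs);
    result = await settle();
  }
  return result;
}
```

In the middleware, this would wrap the existing `verifier.settle()` call, with `log.warn` passed through `onRetry`. Injecting `sleep` means the retry behavior can be exercised with a stubbed settle function and no real delays.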
Operational context:
We run a relay that processes concurrent settlements (x402-relay v1.29.0). We added proactive nonce reconciliation on the relay side to prevent the race that causes conflicting_nonce in the first place — this middleware-side retry is a good complementary defense layer for cases that slip through. The 5s first retry matches what we observe: nonce contention at the relay clears within 1–3 seconds, so 5s is a safe floor.
Summary
Fixes #84. Concurrent x402 payment settlements can hit nonce contention at the relay, causing both to fail with `conflicting_nonce`. This is a transient relay-side race condition (not a client error), so the middleware should retry transparently before propagating the failure to the caller.
Changes
- Wrap `verifier.settle()` in a `while` retry loop in `src/middleware/x402.ts`
- On `conflicting_nonce` in `settleResult.errorReason`, retry up to 3 times with exponential backoff: 5s → 10s → 20s
- All other failures (non-`conflicting_nonce` failures, thrown exceptions) fall through immediately, with no change to existing behavior
- `log.warn` on each retry attempt for observability
Why this approach
The issue logs show the relay itself returns `conflicting_nonce` when two settlements land within the same second. The relay nonce contention clears quickly, so a short delay (5s first retry) recovers most concurrent cases without the client needing to rebuild the payment.
Testing
No new tests are added (E2E tests require a live relay). The fix is isolated to the retry wrapper around the existing `settle()` call. The TypeScript errors in `src/endpoints/hashing/` pre-date this change and are unrelated to it.
🤖 Generated with Claude Code